PRELIMINARY AMENDMENT



Assistant Commissioner for Patents 
Washington, D.C. 20231 

Dear Sir: 

Prior to the examination of the above-identified application, kindly amend the application
as follows.

IN THE SEQUENCE LISTING:

Please replace page 1 of the Sequence Listing in the published PCT application with the
enclosed substitute page 1.

IN THE CLAIMS:

Please cancel Claims 1-85. 

Please add new Claims 86-121.

86. A method of obtaining a plurality of biallelic markers comprising the steps of:

a) obtaining a nucleic acid library comprising a plurality of genomic DNA
fragments comprising the full genome or a portion thereof;

b) determining the order of said plurality of genomic DNA fragments in the
genome; 

c) determining the sequence of selected regions of said plurality of genomic 
DNA fragments; and 

d) identifying nucleotides in said plurality of genomic DNA fragments which 
vary between individuals, thereby defining a set of biallelic markers. 

87. The method of Claim 86, further comprising selecting a minimally overlapping 
set of genomic fragments from said nucleic acid library. 

88. The method of Claim 86, further comprising identifying one biallelic marker per
genomic DNA fragment.

89. The method of Claim 86, further comprising identifying two or more biallelic
markers per genomic DNA fragment.

90. The method of Claim 86, further comprising detecting a set of biallelic markers 
having a desired average heterozygosity rate. 

91. The method of Claim 86, further comprising selecting biallelic markers having a 
heterozygosity rate of at least about 0.18. 

92. The method of Claim 86, further comprising selecting biallelic markers having a 
heterozygosity rate of at least about 0.32. 

93. The method of Claim 86, further comprising selecting biallelic markers having a 
heterozygosity rate of at least about 0.42. 

94. The method of Claim 86, wherein said identifying step comprises identifying at 
least about 20,000 biallelic markers.

95. The method of Claim 86, wherein the step of determining the sequence of selected 
regions of said plurality of genomic DNA fragments comprises inserting fragments of said
plurality of genomic DNA fragments into a vector to generate a plurality of subclones and 
determining the sequence of a region of the inserts in said plurality of subclones or a subset 
thereof. 

96. The method of Claim 86, wherein a set of about 10,000 to about 30,000 genomic 
DNA inserts with an average size between 100kb and 300kb are ordered.

97. The method of Claim 86, further comprising determining the position of said
biallelic markers along the genome or a portion thereof.

98. The method of Claim 86, further comprising obtaining pluralities of biallelic 
markers such that each marker is in linkage disequilibrium with at least one of identified 
markers. 

99. The method of Claim 86, wherein said portion of the genome comprises at least 
200 kb of contiguous genomic DNA. 

100. The method of Claim 86, wherein said portion of the genome comprises at least 2 
Mb of contiguous genomic DNA. 

101. The method of Claim 86, wherein said portion of the genome comprises at least 
20 Mb of contiguous genomic DNA. 

102. The method of Claim 86, further comprising the step of identifying one or more
groups of biallelic markers which are in proximity to one another in the genome.

103. The method of Claim 86, further comprising the step of identifying one or more
groups of biallelic markers which are in proximity to one another in the genome, wherein the
biallelic markers in each of these groups are located within a genomic region spanning from 1 to
5kb.

104. The method of Claim 86, further comprising the step of identifying one or more
groups of biallelic markers which are in proximity to one another in the genome, wherein the
biallelic markers in each of these groups are located within a genomic region spanning from 50
to 150kb.

105. The method of Claim 86, further comprising the step of identifying one or more
groups of biallelic markers which are in proximity to one another in the genome, wherein the
biallelic markers in each of these groups are located within a genomic region spanning more than
1Mb.

106. A set of biallelic markers obtained by the method of Claim 86, wherein the
markers in said set are on average evenly spaced over the full genome or a portion thereof.

107. The set of biallelic markers of Claim 106, wherein the markers in said set are
ordered relative to one another.

108. The set of biallelic markers according to Claim 106 or Claim 107, wherein the
markers in said set have a known genomic position.

109. The set of biallelic markers of Claim 106, wherein said biallelic markers are
separated from one another by an average distance of 100 to 150kb.

110. The set of biallelic markers of Claim 106, wherein said biallelic markers are
separated from one another by an average distance of 25 to 50kb.

111. The set of biallelic markers of Claim 106, wherein said biallelic markers are
separated from one another by an average distance of 10 to 200kb.

112. The set of biallelic markers of Claim 106, wherein said biallelic markers have a
heterozygosity rate of at least about 0.18.

113. The set of biallelic markers of Claim 106, wherein said biallelic markers have a
heterozygosity rate of at least about 0.32.

114. The set of biallelic markers of Claim 106, wherein said biallelic markers have a
heterozygosity rate of at least about 0.42.

115. A map comprising an ordered array of at least 20,000 biallelic markers obtained
by the method of Claim 86.

116. A method of identifying one or more biallelic markers associated with a
detectable trait comprising the steps of:

a) determining the frequencies of each allele of said one or more biallelic
markers obtained by the method of Claim 86 in individuals who express said detectable 
trait and individuals who do not express said detectable trait; and 

b) identifying one or more alleles of said one or more biallelic markers which 
are statistically associated with the expression of said detectable trait. 

117. A method of identifying one or more biallelic markers associated with a 
detectable trait comprising the steps of: 

a) selecting a gene in which mutations result in a detectable trait or a gene 
suspected of being associated with a detectable trait; and 

b) identifying one or more biallelic markers obtained by the method of 
Claim 86 within the genomic region harboring said gene which are associated with said 
detectable trait. 

118. A method for determining whether an individual is at risk of developing a
detectable trait or suffers from a detectable trait associated with said trait comprising the steps of:

a) obtaining a nucleic acid sample from said individual;

b) screening said nucleic acid sample with one or more biallelic markers
obtained by the method of Claim 86; and 

c) determining whether said nucleic acid sample contains one or more of 
biallelic markers statistically associated with said detectable trait. 

119. A method of using a drug comprising: 

a) obtaining a nucleic acid sample from an individual; 

b) determining the identity of the polymorphic base of one or more biallelic 
markers obtained by the method of Claim 86 which is associated with a positive response 
to treatment with said drug or one or more biallelic markers obtained by the method of
Claim 86 which is associated with a negative response to treatment with said drug; and 

c) administering said drug to said individual if said nucleic acid sample 
contains one or more biallelic markers associated with a positive response to treatment 
with said drug or if said nucleic acid sample lacks one or more biallelic markers 
associated with a negative response to said drug. 

120. A method of selecting an individual for inclusion in a clinical trial of a drug
comprising:

a) obtaining a nucleic acid sample from an individual;

b) determining the identity of the polymorphic base of one or more biallelic
markers obtained by the method of Claim 86 which is associated with a positive response
to treatment with said drug or one or more biallelic markers associated with a negative
response to treatment with said drug in said nucleic acid sample; and

c) including said individual in said clinical trial if said nucleic acid sample
contains one or more biallelic markers obtained by the method of Claim 86 which is
associated with a positive response to treatment with said drug or if said nucleic acid
sample lacks one or more biallelic markers associated with a negative response to said
drug.

121. A method of identifying a gene associated with a detectable trait comprising the
steps of:

a) determining the frequency of each allele of one or more biallelic markers
obtained by the method of Claim 86 in individuals having said detectable trait and
individuals lacking said detectable trait;

b) identifying one or more alleles of one or more biallelic markers having a
statistically significant association with said detectable trait; and

c) identifying a gene in linkage disequilibrium with said one or more alleles.

Appl. No.: Unknown
Filed: January 14, 2000



REMARKS

Page 1 of the Sequence Listing has been amended to provide the names of the inventors
as the applicants according to U.S. procedure rather than listing the assignee as the applicant.
The remainder of the Sequence Listing is identical to the Sequence Listing in the PCT
Application. Accordingly, the amendments to the Sequence Listing do not introduce any new
matter.

If the Examiner has any questions regarding the above amendments, he is cordially
invited to contact the undersigned so that any such questions may be promptly resolved.



Respectfully submitted, 



KNOBBE, MARTENS, OLSON & BEAR, LLP 





Daniel Hart 

Registration No. 40,637 

Attorney of Record

620 Newport Center Drive 

Sixteenth Floor 

Newport Beach, CA 92660 

(619) 235-8550 



AMEND 
BIALLELIC MARKERS FOR USE IN CONSTRUCTING A HIGH DENSITY DISEQUILIBRIUM MAP OF THE HUMAN GENOME

Background of the Invention

Recent advances in genetic engineering and bioinformatics have enabled the manipulation and characterization
of large portions of the human genome. While efforts to obtain the full sequence of the human genome are rapidly
progressing, there are many practical uses for genetic information which can be implemented with partial knowledge of
the sequence of the human genome.

As the full sequence of the human genome is assembled, the partial sequence information available can be used
to identify genes responsible for detectable human traits, such as genes associated with human diseases, and to develop
diagnostic tests capable of identifying individuals who express a detectable trait as the result of a specific genotype or
individuals whose genotype places them at risk of developing a detectable trait at a subsequent time. Each of these
applications for partial genomic sequence information is based upon the assembly of genetic and physical maps which
order the known genomic sequences along the human chromosomes.

The present invention relates to human genomic sequences which can be used to construct a high resolution
map of the human genome, methods for constructing such a map, methods of identifying genes associated with
detectable human traits, and diagnostics for identifying individuals who carry a gene which causes them to express a
detectable trait or which places them at risk of expressing a detectable trait in the future.

Summary of the Invention 

A first embodiment of the present invention is a method of obtaining a set of biallelic markers comprising the
steps of obtaining a nucleic acid library comprising a plurality of genomic DNA fragments comprising the full genome or a
portion thereof, determining the order of said plurality of genomic DNA fragments in the genome, determining the
sequence of selected regions of said plurality of genomic DNA fragments, and identifying nucleotides in said plurality of
genomic DNA fragments which vary between individuals, thereby defining a set of biallelic markers.

In one aspect of this first embodiment, the identifying step comprises identifying about 20,000 biallelic
markers. In another aspect of this first embodiment, the identifying step comprises identifying about 40,000 biallelic
markers. In a further aspect of this embodiment, the identifying step comprises identifying about 60,000 biallelic
markers. In still another aspect of this first embodiment, the identifying step comprises identifying about 80,000
biallelic markers. In still another aspect of this first embodiment, the identifying step comprises identifying about
100,000 biallelic markers. In still another aspect of this first embodiment, the identifying step comprises identifying
about 120,000 biallelic markers.

In still another aspect of this first embodiment, the biallelic markers are separated from one another by an
average distance of 10kb-200 kb. In still another aspect of this first embodiment, the biallelic markers are separated
from one another by an average distance of 15kb-150 kb. In still another aspect of this first embodiment, the biallelic
markers are separated from one another by an average distance of 20kb-100 kb. In still another aspect of this first
embodiment, the biallelic markers are separated from one another by an average distance of 100kb-150 kb. In still
another aspect of this first embodiment, the biallelic markers are separated from one another by an average distance of
50-100kb. In still another aspect of this first embodiment, the biallelic markers are separated from one another by an
average distance of 25 kb-50 kb.

In still another aspect of this first embodiment, the step of determining the sequence of selected regions of
said plurality of genomic DNA fragments comprises inserting fragments of said plurality of genomic DNA fragments into
a vector to generate a plurality of subclones and determining the sequence of a region of the inserts in said plurality of
subclones or a subset thereof. For example, in this aspect of the first embodiment, the step of determining the sequence
of a region of said inserts or a subset thereof may comprise determining the sequence of one or both end regions of said
inserts or a subset thereof. In this aspect of the first embodiment, the step of determining the sequence of one or both
end regions of said plurality of subclones comprises determining the sequence of about 500 bases at each end of said
subclones or a subset thereof.

In still another aspect of this first embodiment, a set of about 10,000 to about 20,000 genomic DNA inserts
with an average size between 100kb and 300kb are ordered. In still another aspect of this first embodiment, a set of
about 10,000 to about 30,000 genomic DNA inserts with an average size between 100kb and 150 kb are ordered. In
still another aspect of this first embodiment, a set of about 15,000 to about 25,000 genomic DNA inserts with an
average size between 100kb and 200 kb are ordered.

In still another aspect of this first embodiment, the identifying step comprises identifying between 1 and 6
biallelic markers per genomic DNA fragment. In still another aspect of this first embodiment, the identifying step
comprises identifying an average of 3 biallelic markers per genomic DNA insert.

In still another aspect of this first embodiment, the genomic DNA fragments are in a Bacterial Artificial
Chromosome. In still another aspect of this first embodiment, the genomic DNA fragments are in a Yeast Artificial
Chromosome.

In still another aspect of this first embodiment, the method further comprises determining the position of said
biallelic markers along the genome or a portion thereof. In this aspect of the first embodiment, the step of determining
the position of said biallelic markers along the genome or portion thereof may comprise determining the position of said
biallelic markers along a chromosome. In this aspect of the first embodiment, the step of determining the position of
said biallelic markers along the genome or portion thereof comprises determining the position of said biallelic markers
along a subchromosomal region.

In still another aspect of this first embodiment, the method further comprises identifying biallelic markers
which are in linkage disequilibrium with one another. In this aspect of the first embodiment, the method may further
comprise optimizing the intermarker spacing between said biallelic markers such that each identified marker is in linkage
disequilibrium with at least one other identified marker.
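
By way of illustration only, linkage disequilibrium between two biallelic markers is commonly summarized by the coefficient D and the squared correlation r^2 computed from haplotype counts. The text does not specify which measure is used, so the choice of D and r^2, the function name, and the haplotype labels (A/a for the first marker, B/b for the second) are assumptions made for this sketch.

```python
def linkage_disequilibrium(n_AB: int, n_Ab: int, n_aB: int, n_ab: int):
    """Return (D, r_squared) for two biallelic markers from haplotype counts.

    Hypothetical helper: alleles A/a belong to the first marker, B/b to the second.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("at least one haplotype must be observed")
    p_AB = n_AB / total            # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total    # frequency of allele A
    p_B = (n_AB + n_aB) / total    # frequency of allele B
    d = p_AB - p_A * p_B           # deviation from independence
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both markers must be polymorphic in the sample")
    return d, d * d / denom


if __name__ == "__main__":
    # Hypothetical counts for four haplotypes observed in a sample.
    print(linkage_disequilibrium(40, 10, 10, 40))
```

An r^2 near 1 indicates that the allele at one marker largely predicts the allele at the other, while a value near 0 indicates independence.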

In still another aspect of this first embodiment, the portion of the genome comprises at least 200 kb of
contiguous genomic DNA. In still another aspect of this first embodiment, the portion of the genome comprises at least
300 kb of contiguous genomic DNA. In still another aspect of this first embodiment, the portion of the genome
comprises at least 500kb of contiguous genomic DNA. In still another aspect of this first embodiment, the portion of the
genome comprises at least 2 Mb of contiguous genomic DNA. In still another aspect of this first embodiment, the portion
of the genome comprises at least 5 Mb of contiguous genomic DNA. In still another aspect of this first embodiment, the
portion of the genome comprises at least 10 Mb of contiguous genomic DNA. In still another aspect of this first
embodiment, the portion of the genome comprises at least 20 Mb of contiguous genomic DNA.

In still another aspect of this first embodiment, the method further comprises the step of identifying one or
more groups of biallelic markers which are in proximity to one another in the genome. In this aspect of the first
embodiment, the biallelic markers in each of these groups may be located within a genomic region spanning less than
1kb. Alternatively, in this aspect of the first embodiment, the biallelic markers in each of these groups may be located
within a genomic region spanning from 1 to 5kb. Alternatively, in this aspect of the first embodiment, the biallelic markers
in each of these groups may be located within a genomic region spanning from 5 to 10kb. Alternatively, in this aspect of
the first embodiment, the biallelic markers in each of these groups may be located within a genomic region spanning from
10 to 25kb. Alternatively, in this aspect of the first embodiment, the biallelic markers in each of these groups may be
located within a genomic region spanning from 25 to 50kb. Alternatively, in this aspect of the first embodiment, the
biallelic markers in each of these groups may be located within a genomic region spanning from 50 to 150kb.
Alternatively, in this aspect of the first embodiment, the biallelic markers in each of these groups may be located within a
genomic region spanning from 150 to 250kb. Alternatively, in this aspect of the first embodiment, the biallelic markers in
each of these groups may be located within a genomic region spanning from 250 to 500kb. Alternatively, in this aspect of
the first embodiment, the biallelic markers in each of these groups may be located within a genomic region spanning from
500kb to 1Mb. Alternatively, in this aspect of the first embodiment, the biallelic markers in each of these groups may be
located within a genomic region spanning more than 1Mb.
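
As a rough sketch of how markers in proximity to one another might be grouped once their positions are known, the following greedy pass collects sorted marker positions into groups whose total span does not exceed a chosen limit. The greedy strategy, the function name, and the example coordinates are assumptions for illustration; the text does not prescribe a grouping procedure.

```python
def group_markers(positions_bp, max_span_bp):
    """Group marker positions (in base pairs) so each group spans at most max_span_bp.

    Hypothetical helper: a greedy left-to-right pass; groups with a single marker
    are discarded, since a group is taken here to need at least two markers.
    """
    groups = []
    current = []
    for pos in sorted(positions_bp):
        # Close the current group when this marker would stretch it past the limit.
        if current and pos - current[0] > max_span_bp:
            if len(current) >= 2:
                groups.append(current)
            current = []
        current.append(pos)
    if len(current) >= 2:
        groups.append(current)
    return groups


if __name__ == "__main__":
    # Hypothetical marker coordinates on one chromosome, grouped within 1kb.
    print(group_markers([100, 600, 900, 50_000, 50_400, 200_000], 1_000))
```

Changing `max_span_bp` to 5,000, 150,000 or 1,000,000 would correspond to the wider region sizes listed above.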

A second embodiment of the present invention is a method of obtaining a set of biallelic markers comprising the
steps of obtaining a nucleic acid library comprising genomic DNA fragments comprising the full genome or a portion
thereof, determining the sequence of selected regions of said genomic DNA fragments, identifying nucleotides in said
genomic DNA fragments which vary between individuals, thereby defining a set of biallelic markers, and
determining the order of said biallelic markers along the genome or portion thereof.

A third embodiment of the present invention is a set of biallelic markers obtained by the method of the first
embodiment. In one aspect of this third embodiment, the markers in said set have a known genomic position. In another
aspect of this third embodiment, the markers in said set have a known genomic relationship to one another.

A fourth embodiment of the present invention is a set of biallelic markers having a known relationship to one
another and a known genomic position, said set of biallelic markers being obtained by the method of the first
embodiment. In one aspect of this fourth embodiment, the biallelic markers have heterozygosity rates of at least about
0.18. In another aspect of this fourth embodiment, the biallelic markers have a heterozygosity rate of at least about 0.32.
In still another aspect of this fourth embodiment, the biallelic markers have a heterozygosity rate of at least about 0.42.
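
The heterozygosity thresholds can be related to allele frequencies. As a minimal sketch, assuming Hardy-Weinberg proportions (an assumption not stated in the text), the expected heterozygosity of a biallelic marker whose less common allele has frequency p is 2p(1-p), so rates of about 0.18, 0.32 and 0.42 correspond to minor allele frequencies of about 0.1, 0.2 and 0.3. The function names and example frequencies are hypothetical.

```python
def heterozygosity(minor_allele_freq: float) -> float:
    """Expected heterozygosity 2p(1-p) of a biallelic marker (Hardy-Weinberg assumed)."""
    p = minor_allele_freq
    if not 0.0 <= p <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")
    return 2.0 * p * (1.0 - p)


def select_markers(freqs: dict, min_het: float) -> list:
    """Return the names of markers whose expected heterozygosity is at least min_het."""
    return sorted(name for name, p in freqs.items() if heterozygosity(p) >= min_het)


if __name__ == "__main__":
    # Hypothetical minor allele frequencies for four markers.
    example = {"m1": 0.05, "m2": 0.15, "m3": 0.25, "m4": 0.40}
    for threshold in (0.18, 0.32, 0.42):
        print(threshold, select_markers(example, threshold))
```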

A fifth embodiment of the present invention is a map comprising an ordered array of at least 20,000 biallelic
markers obtained by the method of the first embodiment. In one aspect of this fifth embodiment, the map comprises an
ordered array of at least 60,000 biallelic markers obtained by the method of the first embodiment. In another aspect of
this fifth embodiment, the map comprises an ordered array of at least 120,000 biallelic markers obtained by the method
of the first embodiment.

In another aspect of this fifth embodiment, the biallelic markers are distributed at an average marker density of
one marker every 150kb. In a further aspect of this fifth embodiment, the biallelic markers are distributed at an average
marker density of one marker every 50 kb. In a further aspect of this fifth embodiment, the biallelic markers are
distributed at an average marker density of one marker every 25 kb.
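
The marker counts and densities recited for the map are mutually consistent if the genome is taken to be roughly 3,000 Mb. The following sketch makes that arithmetic explicit; the genome size constant is an approximation introduced here, not a figure from the text, and the function names are hypothetical.

```python
import math

# Assumption: an approximate haploid human genome size of 3,000 Mb.
HUMAN_GENOME_BP = 3_000_000_000


def average_spacing_kb(n_markers: int, genome_bp: int = HUMAN_GENOME_BP) -> float:
    """Average distance in kb between evenly distributed markers."""
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    return genome_bp / n_markers / 1000.0


def markers_needed(spacing_kb: float, genome_bp: int = HUMAN_GENOME_BP) -> int:
    """Number of evenly spaced markers required for a given average spacing in kb."""
    if spacing_kb <= 0:
        raise ValueError("spacing_kb must be positive")
    return math.ceil(genome_bp / (spacing_kb * 1000.0))


if __name__ == "__main__":
    for count in (20_000, 60_000, 120_000):
        print(count, "markers:", average_spacing_kb(count), "kb average spacing")
```

Under this assumption, 20,000, 60,000 and 120,000 markers give one marker every 150 kb, 50 kb and 25 kb respectively.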

A sixth embodiment of the present invention is a method of identifying one or more biallelic markers associated
with a detectable trait comprising the steps of determining the frequencies of each allele of one or more biallelic
markers obtained by the method of the first embodiment in individuals who express said detectable trait and individuals
who do not express said detectable trait, and identifying one or more alleles of said one or more biallelic markers which
are statistically associated with the expression of said detectable trait. In one aspect of this sixth embodiment, the
detectable trait is selected from the group consisting of disease, drug response, drug efficacy, and drug toxicity. In
another aspect of this sixth embodiment, the phenotype of said individuals who express said detectable trait and the
phenotype of said individuals who do not express said detectable trait are readily distinguishable from one another. In
still another aspect of this sixth embodiment, the individuals who express said detectable trait and the individuals who do
not express said detectable trait are selected from a bimodal phenotype distribution. In still another aspect of this sixth
embodiment, the individuals who express said detectable trait are at one phenotypic extreme of the population and said
individuals who do not express said detectable trait are at the other phenotypic extreme of the population.
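
One common way to test whether an allele is statistically associated with a trait is a Pearson chi-square test on the 2x2 table of allele counts in trait-positive and trait-negative individuals. The text does not name a particular statistical test, so the chi-square test, the function name, and the example counts below are assumptions made for illustration.

```python
import math


def allele_association(pos_a1: int, pos_a2: int, neg_a1: int, neg_a2: int):
    """Pearson chi-square test (1 degree of freedom) on allele counts.

    pos_* are counts of allele 1 and allele 2 in trait-positive individuals,
    neg_* the same counts in trait-negative individuals. Returns (chi2, p_value).
    """
    table = [[pos_a1, pos_a2], [neg_a1, neg_a2]]
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row)
    if 0 in row or 0 in col:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical allele counts: 100 chromosomes per group.
    print(allele_association(60, 40, 40, 60))
```

When many markers are tested, the resulting p-values would normally need correction for multiple testing, which this sketch does not attempt.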

A seventh embodiment of the present invention is a method of identifying a haplotype associated with a trait
comprising the steps of obtaining nucleic acid samples from trait positive and trait negative individuals, determining
the frequencies of the alleles of each member of a group of biallelic markers obtained by the method of the first
embodiment which are known to be located in proximity to one another in the genome in said nucleic acid samples, and
identifying a plurality of alleles of biallelic markers having a statistically significant association with said trait. In one
aspect of this seventh embodiment, the detectable trait is selected from the group consisting of disease, drug response,
drug efficacy, and drug toxicity.

In another aspect of this seventh embodiment, the biallelic markers in each of these groups are located within
a genomic region spanning less than 1kb. In still another aspect of this seventh embodiment, the biallelic markers in each
of these groups are located within a genomic region spanning from 1 to 5kb. In still another aspect of this seventh
embodiment, the biallelic markers in each of these groups are located within a genomic region spanning from 5 to 10kb.
In still another aspect of this seventh embodiment, the biallelic markers in each of these groups are located within a
genomic region spanning from 10 to 25kb. In still another aspect of this seventh embodiment, the biallelic markers in
each of these groups are located within a genomic region spanning from 25 to 50kb. In still another aspect of this seventh
embodiment, the biallelic markers in each of these groups are located within a genomic region spanning from 50 to
150kb. In still another aspect of this seventh embodiment, the biallelic markers in each of these groups are located
within a genomic region spanning from 150 to 250kb. In still another aspect of this seventh embodiment, the biallelic
markers in each of these groups are located within a genomic region spanning from 250 to 500kb. In still another aspect
of this seventh embodiment, the biallelic markers in each of these groups are located within a genomic region spanning
from 500kb to 1Mb. In still another aspect of this seventh embodiment, the biallelic markers in each of these groups are
located within a genomic region spanning more than 1Mb.

An eighth embodiment of the present invention is a method of identifying one or more biallelic markers
associated with a detectable trait comprising the steps of selecting a gene in which mutations result in a detectable trait
or a gene suspected of being associated with a detectable trait and identifying one or more biallelic markers obtained by
the method of Claim 1 within the genomic region harboring said gene which are associated with said detectable trait. In
one aspect of this eighth embodiment, the detectable trait is selected from the group consisting of disease, drug
response, drug efficacy, and drug toxicity. In another aspect of this eighth embodiment, the identifying step comprises
determining the frequencies of said one or more biallelic markers in individuals who express said detectable
trait and individuals who do not express said detectable trait and identifying one or more biallelic markers which are
statistically associated with the expression of said detectable trait.

A ninth embodiment of the present invention is an array of nucleic acids fixed to a support, said nucleic acids
comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, of one or more biallelic markers
obtained by the method of the first embodiment. In one aspect of this ninth embodiment, the nucleic acids comprise at
least 15 consecutive nucleotides, including the polymorphic nucleotide, of at least five biallelic markers obtained by the
method of the first embodiment. In another aspect of this ninth embodiment,
the nucleic acids comprise at least 8 consecutive nucleotides, including the polymorphic nucleotide, of at least ten
biallelic markers obtained by the method of the first embodiment.
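
To make the phrase "consecutive nucleotides, including the polymorphic nucleotide" concrete, the sketch below enumerates every stretch of a given length within a marker sequence that contains the polymorphic position. The function name, the zero-based index convention, and the example sequence are hypothetical and serve only to illustrate the constraint.

```python
def probe_windows(sequence: str, snp_index: int, length: int = 8) -> list:
    """List all substrings of `length` consecutive nucleotides that include snp_index.

    Hypothetical helper: snp_index is the zero-based position of the polymorphic base.
    """
    if not 0 <= snp_index < len(sequence):
        raise ValueError("snp_index must fall inside the sequence")
    if length < 1 or length > len(sequence):
        raise ValueError("length must be between 1 and the sequence length")
    first = max(0, snp_index - length + 1)          # leftmost start that still covers the site
    last = min(snp_index, len(sequence) - length)   # rightmost start that fits in the sequence
    return [sequence[start:start + length] for start in range(first, last + 1)]


if __name__ == "__main__":
    # Hypothetical 12-base sequence with the polymorphic base at index 5.
    for window in probe_windows("ACGTACGTACGT", 5, 8):
        print(window)
```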

A tenth embodiment of the present invention is an array of nucleic acids fixed to a support, said nucleic acids 
comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, of one or more groups of biallelic
markers known to be located in proximity to one another in the genome. 

An eleventh embodiment of the present invention is an array of nucleic acids fixed to a support, said nucleic
acids comprising amplification primers for generating an amplification product comprising at least 8 consecutive
nucleotides, including the polymorphic nucleotide, of one or more biallelic markers obtained by the method of the first
embodiment.

A twelfth embodiment of the present invention is an array of nucleic acids fixed to a support, said nucleic acids
comprising amplification primers for generating an amplification product comprising at least 15 consecutive
nucleotides, including the polymorphic nucleotide, of one or more groups of biallelic markers known to be located in
proximity to one another in the genome.

A thirteenth embodiment of the present invention is an array of nucleic acids fixed to a support, said nucleic
acids comprising one or more microsequencing primers for determining the identity of the polymorphic base of one or
more nucleic acids comprising at least 15 consecutive nucleotides, including the polymorphic nucleotide, of one or more
biallelic markers obtained by the method of the first embodiment.

A fourteenth embodiment of the present invention is an array of nucleic acids fixed to a support, said
nucleic acids comprising one or more microsequencing primers for determining the identity of the polymorphic bases of
one or more groups of biallelic markers known to be located in proximity to one another in the genome.

A fifteenth embodiment of the present invention is an array of nucleic acids fixed to a support, wherein said
nucleic acids are complementary to one or more microsequencing primers for determining the identities of the
polymorphic bases of one or more biallelic markers obtained by the method of the first embodiment. In one aspect of
this fifteenth embodiment, the nucleic acids are complementary to at least five microsequencing primers for determining
the identities of the polymorphic bases of at least five biallelic markers obtained by the method of the first embodiment.
In another aspect of this fifteenth embodiment, the nucleic acids are complementary to at least ten microsequencing
primers for determining the identities of the polymorphic bases of at least ten biallelic markers obtained by the method
of the first embodiment.

A sixteenth embodiment of the present invention is an array of nucleic acids fixed to a support, said nucleic
acids comprising one or more nucleic acids complementary to one or more microsequencing primers for determining the
identity of the polymorphic bases of one or more groups of biallelic markers known to be located in proximity to one
another in the genome.

Another aspect of tiie present invention is an array of any ono of the teritli, twelfth, fourteenth or sixteenth 
embodiments, wherein the members of each ol said one or more groups of bialltiHc markers are located in physical 
proximity to one another on said support , 

Another aspect of the present invention is an array of any one of Claims of the tenth, twelfth, founeenth or 
sixteenth embodiments, wherein said biallelic markers in each cf these groups arc located within a genomic region 
spannirjg less than Ikb. , " 

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein said biallelic markers in each of these groups are located within a genomic region spanning from 1
to 5 kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from 5
to 10 kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from
10 to 25 kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from
25 to 50 kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from
50 to 150 kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from
150 to 250 kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from
250 to 500 kb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning from
500 kb to 1 Mb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein the biallelic markers in each of these groups are located within a genomic region spanning more
than 1 Mb.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein each group of biallelic markers comprises at least 3 biallelic markers.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein each group of biallelic markers comprises at least 6 biallelic markers.

Another aspect of the present invention is an array of any one of the tenth, twelfth, fourteenth or sixteenth
embodiments, wherein each group of biallelic markers comprises at least 20 biallelic markers.

A seventeenth embodiment of the present invention is a method for determining whether an individual is at risk
of developing a detectable trait or suffers from a detectable trait associated with said trait comprising the steps of
obtaining a nucleic acid sample from said individual, screening said nucleic acid sample with one or more biallelic markers
obtained by the method of the first embodiment, and determining whether said nucleic acid sample contains one or more
biallelic markers statistically associated with said detectable trait. In one aspect of this seventeenth embodiment, the
detectable trait is selected from the group consisting of disease, drug response, drug efficacy and drug toxicity. In
another aspect of this seventeenth embodiment, the biallelic markers were obtained by the method of the sixth
embodiment. In another aspect of this seventeenth embodiment, the biallelic markers were obtained by the method of
the eighth embodiment.

An eighteenth embodiment of the present invention is a method of using a drug comprising obtaining a nucleic
acid sample from an individual, determining the identity of the polymorphic base of one or more biallelic markers obtained
by the method of the first embodiment which is associated with a positive response to treatment with said drug or one
or more biallelic markers obtained by the method of the first embodiment which is associated with a negative response
to treatment with said drug, and administering said drug to said individual if said nucleic acid sample contains one or
more biallelic markers associated with a positive response to treatment with said drug or if said nucleic acid sample
lacks one or more biallelic markers associated with a negative response to said drug. In one aspect of this eighteenth
embodiment, the determining step comprises determining the identity of the polymorphic base of one or more biallelic
markers obtained by the method of the aspect of the sixth embodiment wherein the trait is drug response which is
associated with a positive response to treatment with said drug or one or more biallelic markers obtained by the aspect
of the sixth embodiment wherein the trait is drug response which is associated with a negative response to treatment
with said drug. In another aspect of this eighteenth embodiment, the determining step comprises determining the
identity of the polymorphic base of one or more biallelic markers obtained by the aspect of the eighth embodiment
wherein the trait is drug response which is associated with a positive response to treatment with said drug or one or
more biallelic markers obtained by the method of the aspect of the eighth embodiment wherein the trait is drug response
which is associated with a negative response to treatment with said drug.

A nineteenth embodiment of the present invention is a method of selecting an individual for inclusion in a
clinical trial of a drug comprising obtaining a nucleic acid sample from an individual, determining the identity of the
polymorphic base of one or more biallelic markers obtained by the method of the first embodiment which is associated
with a positive response to treatment with said drug or one or more biallelic markers associated with a negative
response to treatment with said drug in said nucleic acid sample, and including said individual in said clinical trial if said
nucleic acid sample contains one or more biallelic markers obtained by the method of the first embodiment which is
associated with a positive response to treatment with said drug or if said nucleic acid sample lacks one or more biallelic
markers associated with a negative response to said drug. In one aspect of this nineteenth embodiment, the determining
step comprises determining the identity of the polymorphic base of one or more biallelic markers obtained by the aspect
of the sixth embodiment wherein the trait is drug response which is associated with a positive response to treatment
with said drug or one or more biallelic markers obtained by the aspect of the sixth embodiment wherein the trait is drug
response which is associated with a negative response to treatment with said drug. In another aspect of this nineteenth
embodiment, the determining step comprises determining the identity of the polymorphic base of one or more biallelic
markers obtained by the aspect of the eighth embodiment wherein the trait is drug response which is associated with a
positive response to treatment with said drug or one or more biallelic markers obtained by the aspect of the eighth
embodiment wherein the trait is drug response which is associated with a negative response to treatment with said
drug.

A twentieth embodiment of the present invention is a method of identifying a gene associated with a
detectable trait comprising the steps of determining the frequency of each allele of one or more biallelic markers
obtained by the method of the first embodiment in individuals having said detectable trait and individuals lacking said
detectable trait, identifying one or more alleles of one or more biallelic markers having a statistically significant
association with said detectable trait, and identifying a gene in linkage disequilibrium with said one or more alleles.
In one aspect of this twentieth embodiment, the method further comprises identifying a mutation in the gene which is
associated with said detectable trait. In another aspect of this twentieth embodiment, the detectable trait is selected
from the group consisting of disease, drug response, drug efficacy, and drug toxicity.
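
The allele frequency comparison recited in the twentieth embodiment can be illustrated with a short sketch. The text does not specify which statistical test is used, so the Pearson chi-square test on a 2 x 2 table of allele counts shown here is an assumption chosen for illustration, and the function name and the allele counts are hypothetical.

```python
import math


def allele_association_test(case_a, case_b, control_a, control_b):
    """Pearson chi-square test (1 degree of freedom) on a 2 x 2 table of allele counts.

    case_a / case_b: counts of the two alleles in trait positive individuals.
    control_a / control_b: counts of the two alleles in trait negative individuals.
    Returns (chi_square, p_value). Assumed test, not prescribed by the text.
    """
    n = case_a + case_b + control_a + control_b
    row_case = case_a + case_b
    row_control = control_a + control_b
    col_a = case_a + control_a
    col_b = case_b + control_b
    if min(row_case, row_control, col_a, col_b) == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi_square = (
        n * (case_a * control_b - case_b * control_a) ** 2
        / (row_case * row_control * col_a * col_b)
    )
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical counts: 100 trait positive and 100 trait negative diploid individuals.
chi_square, p_value = allele_association_test(120, 80, 90, 110)
print(f"chi-square = {chi_square:.2f}, p = {p_value:.4f}")
```

An allele whose p-value falls below the chosen significance threshold would be a candidate for the subsequent step of identifying a gene in linkage disequilibrium with it.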

A twenty-first embodiment of the present invention is a method of identifying a gene associated with a
detectable trait comprising selecting a gene suspected of being associated with a detectable trait and identifying
one or more biallelic markers obtained by the method of the first embodiment within the genomic region harboring said
gene which are associated with said detectable trait. In one aspect of this twenty-first embodiment, the detectable trait
is selected from the group consisting of disease, drug response, drug efficacy, and drug toxicity. In another aspect of
this twenty-first embodiment, the identifying step comprises determining the frequencies of said one or more biallelic
markers in individuals who express said detectable trait and individuals who do not express said detectable trait and
identifying one or more biallelic markers which are statistically associated with the expression of said
detectable trait.

A twenty-second embodiment of the present invention is a method of identifying a haplotype associated with
a trait comprising the steps of obtaining nucleic acid samples from trait positive and trait negative individuals,
conducting an amplification reaction on said nucleic acid samples using amplification primers capable of
generating amplification products containing the polymorphic bases of a plurality of biallelic markers, contacting one or
more arrays according to the tenth embodiment with said amplification products, determining the identities of the
polymorphic bases of said amplification products, and identifying a haplotype having a statistically significant
association with said trait.

A twenty-third embodiment of the present invention is a method of identifying a haplotype associated with a
trait comprising the steps of obtaining nucleic acid samples from trait positive and trait negative individuals, conducting
amplification reactions on said nucleic acid samples using amplification primers capable of generating amplification
products containing the polymorphic bases of a plurality of biallelic markers, contacting one or more arrays according to
the fourteenth embodiment with said amplification products, conducting microsequencing reactions on said
amplification products using microsequencing primers on said arrays, thereby generating elongated microsequencing
primers comprising the polymorphic bases of said amplification products, determining the identities of said polymorphic
bases, and identifying a haplotype having a statistically significant association with said trait.

A twenty-fourth embodiment of the present invention is a method of identifying a haplotype associated with a
trait comprising the steps of obtaining nucleic acid samples from trait positive and trait negative individuals, conducting
amplification reactions on said nucleic acid samples using amplification primers which are capable of generating
amplification products containing the polymorphic bases of a plurality of biallelic markers, conducting microsequencing
reactions on said nucleic acid samples, thereby generating microsequencing products containing the polymorphic bases
of one or more biallelic markers at their 3' ends, said polymorphic bases being detectably labeled, contacting one or more
arrays according to the sixteenth embodiment with said microsequencing products such that said microsequencing
products specifically hybridize to said nucleic acids complementary to said microsequencing primers, determining
the identities of the polymorphic bases of said microsequencing products, and identifying a haplotype having a
statistically significant association with said trait.

A twenty-fifth embodiment of the present invention is a method of identifying a haplotype associated with a
trait comprising the steps of obtaining nucleic acid samples from trait positive and trait negative individuals, contacting
one or more arrays according to the twelfth embodiment with said nucleic acid samples, conducting an amplification
reaction on said nucleic acid samples using amplification primers on said array which are capable of generating
amplification products containing the polymorphic bases of a plurality of biallelic markers, determining the identities of
the polymorphic bases of said amplification products, and identifying a haplotype having a statistically significant
association with said trait.

A twenty-sixth embodiment of the present invention is a method of determining whether an individual is at risk
of developing Alzheimer's disease or whether the individual suffers from Alzheimer's disease as a result of possessing
the Apo E ε4 Site A allele comprising obtaining a nucleic acid sample from said individual, and determining the identity
of the polymorphic base in one or more of the sequences selected from the group consisting of SEQ ID Nos. 301-305 and
SEQ ID Nos. 307-311 or the sequences complementary thereto in said nucleic acid sample. In one aspect of this
twenty-sixth embodiment, the method further comprises determining whether said nucleic acid sample contains the sequence of
SEQ ID No. 306 or the sequence complementary thereto. In another aspect of this twenty-sixth embodiment, the step of
determining the identity of the polymorphic bases in one or more of the sequences selected from the group consisting of
SEQ ID Nos. 301-305 and SEQ ID Nos. 307-311 or the sequences complementary thereto comprises determining
whether said nucleic acid sample contains the sequence of SEQ ID No. 311 (the T allele of marker 39-365/344) or the
sequence complementary thereto. In another version of the preceding aspects, the method further comprises determining whether
said nucleic acid sample contains the sequence of SEQ ID No. 306 or the sequence complementary thereto.

A twenty-seventh embodiment of the present invention is an isolated nucleic acid comprising a sequence
selected from the group consisting of SEQ ID No. 301, SEQ ID No. 307, the sequences complementary thereto, and
fragments comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, thereof.

A twenty-eighth embodiment of the present invention is an isolated nucleic acid comprising a sequence
selected from the group consisting of SEQ ID No. 302, SEQ ID No. 308, the sequences complementary thereto, and
fragments comprising at least 8 consecutive nucleotides thereof.

A twenty-ninth embodiment of the present invention is an isolated nucleic acid comprising a sequence selected
from the group consisting of SEQ ID No. 303, SEQ ID No. 309, the sequences complementary thereto, and fragments
comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, thereof.

A thirtieth embodiment of the present invention is an isolated nucleic acid comprising a sequence selected from
the group consisting of SEQ ID No. 304, SEQ ID No. 310, the sequences complementary thereto, and fragments
comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, thereof.

A thirty first embodiment of the present invention is an isolated nucleic acid comprising a sequence selected
from the group consisting of SEQ ID No. 305, SEQ ID No. 311, the sequences complementary thereto, and fragments
comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, thereof.

A thirty second embodiment of the present invention is an isolated nucleic acid comprising a sequence selected
from the group consisting of SEQ ID Nos. 313-317, SEQ ID Nos. 319-323, and fragments comprising at least 8
consecutive nucleotides thereof.

A thirty third embodiment of the present invention is an isolated nucleic acid comprising a sequence selected from
the group consisting of SEQ ID Nos. 325-329, SEQ ID Nos. 331-335, the sequence complementary thereto, and
fragments comprising at least 8 consecutive nucleotides thereof.

A thirty fourth embodiment of the present invention is a set of nucleic acids comprising at least 8 consecutive
nucleotides, including the polymorphic nucleotide, of one or more biallelic markers obtained by the method of the first
embodiment.

A thirty fifth embodiment of the present invention is a set of nucleic acids comprising amplification primers for
generating an amplification product comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide,
of one or more biallelic markers obtained by the method of the first embodiment.

A thirty sixth embodiment of the present invention is a set of nucleic acids comprising one or more
microsequencing primers for determining the identity of the polymorphic base of one or more nucleic acids comprising at
least 8 consecutive nucleotides, including the polymorphic nucleotide, of one or more biallelic markers obtained by the
method of the first embodiment.

Brief Description of the Drawings
Figure 1 is a cytogenetic map of chromosome 21.

Figure 2a shows the results of a computer simulation of the distribution of inter-marker spacing on a randomly
distributed set of biallelic markers indicating the percentage of biallelic markers which will be spaced a given distance
apart for 1, 2, or 3 markers/BAC in a genomic map (assuming a set of 20,000 minimally overlapping BACs covering the
genome are evaluated).

Figure 2b shows the results of a computer simulation of the distribution of inter-marker spacing on a randomly
distributed set of biallelic markers indicating the percentage of biallelic markers which will be spaced a given distance
apart for 1, 3, or 8 markers/BAC in a genomic map (assuming a set of 20,000 minimally overlapping BACs covering the
genome are evaluated).

Figure 3 shows, for a series of hypothetical sample sizes, the p-value significance obtained in association
studies performed using individual markers from the high-density biallelic map, according to various hypotheses regarding
the difference of allelic frequencies between the T+ and T- samples.

Figure 4 is a hypothetical association analysis conducted with a map comprising about 3,000 biallelic markers.

Figure 5 is a hypothetical association analysis conducted with a map comprising about 20,000 biallelic
markers.

Figure 6 is a hypothetical association analysis conducted with a map comprising about 60,000 biallelic
markers.

Figure 7 is a haplotype analysis using biallelic markers in the Apo E region.

Figure 8 is a simulated haplotype analysis using the biallelic markers in the Apo E region included in the
haplotype analysis of Figure 7.

Figure 9 shows a minimal array of overlapping clones which was chosen for further studies of biallelic markers
associated with prostate cancer, the positions of STS markers known to map in the candidate genomic region along the
contig, and the locations of biallelic markers along the BAC contig harboring a genomic region harboring a candidate gene
associated with prostate cancer which were identified using the methods of the present invention.

Figure 10 is a rough localization of a candidate gene for prostate cancer which was obtained by determining
the frequencies of the biallelic markers of Figure 9 in affected and unaffected populations.

Figure 11 is a further refinement of the localization of the candidate gene for prostate cancer using additional
biallelic markers which were not included in the rough localization illustrated in Figure 10.

Figure 12 is a haplotype analysis using the biallelic markers in the genomic region of the gene associated with
prostate cancer.

Figure 13 is a simulated haplotype using the six markers included in haplotype 5 of Figure 12.

Detailed Description of the Preferred Embodiment
The human haploid genome contains an estimated 80,000 to 100,000 or more genes scattered on a
3 x 10^9 base-long double stranded DNA shared among the 24 chromosomes. Each human being is diploid, i.e. possesses
two haploid genomes, one from paternal origin, the other from maternal origin. The sequence of the human genome
varies among individuals in a population. About 10^7 sites scattered along the 3 x 10^9 base pairs of DNA are polymorphic,
existing in at least two variant forms called alleles. Most of these polymorphic sites are generated by single base
substitution mutations and are biallelic. Less than 10^ polymorphic sites are due to more complex changes and are very
often multi-allelic, i.e. exist in more than two allelic forms. At a given polymorphic site, any individual (diploid) can be
either homozygous (twice the same allele) or heterozygous (two different alleles). A given polymorphism or rare mutation
can be either neutral (no effect on trait), or functional, i.e. responsible for a particular genetic trait.

Genetic Maps

The first step towards the identification of genes associated with a detectable trait, such as a disease or any
other detectable trait, consists in the localization of genomic regions containing trait-causing genes using genetic
mapping methods. The preferred traits contemplated within the present invention relate to fields of therapeutic interest;
in particular embodiments, they will be disease traits and/or drug response traits, reflecting drug efficacy or toxicity.
Traits can either be "binary", e.g. diabetic vs. non diabetic, or "quantitative", e.g. elevated blood pressure. Individuals
affected by a quantitative trait can be classified according to an appropriate scale of trait values, e.g. blood pressure
ranges. Each trait value range can then be analyzed as a binary trait. Patients showing a trait value within one such
range will be studied in comparison with patients showing a trait value outside of this range. In such a case, genetic
analysis methods will be applied to subpopulations of individuals showing trait values within defined ranges.
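
The treatment of a quantitative trait as a binary trait described above can be sketched as follows. This is a minimal illustration only; the function name, the half-open range convention, and the blood pressure values are hypothetical and are not taken from the present disclosure.

```python
def binarize_by_range(trait_values, low, high):
    """Label each individual True if the trait value lies in [low, high), else False.

    The half-open interval is an illustrative convention, not one stated in the text.
    """
    return [low <= value < high for value in trait_values]


# Hypothetical systolic blood pressure values (mm Hg) for four individuals.
pressures = [110, 135, 150, 128]
in_range = binarize_by_range(pressures, 130, 160)
print(in_range)
```

Individuals labeled True would then be compared with those labeled False, exactly as trait positive and trait negative individuals are compared for a binary trait.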

Genetic mapping involves the analysis of the segregation of polymorphic loci in trait
positive and trait negative populations. Polymorphic loci constitute a small fraction of the human
genome (less than 1%), compared to the vast majority of human genomic DNA which is identical in
sequence among the chromosomes of different individuals. Among all existing human polymorphic
loci, genetic markers can be defined as genome-derived polynucleotides which are sufficiently
polymorphic to allow a reasonable probability that a randomly selected person will be heterozygous,
and thus informative for genetic analysis by methods such as linkage analysis or association studies.

A genetic map consists of a collection of polymorphic markers which have been positioned on the human
chromosomes. Genetic maps may be combined with physical maps, collections of ordered overlapping fragments of
genomic DNA whose arrangement along the human chromosomes is known. The optimal genetic map should possess
the following characteristics:

- the density of the genetic markers scattered along the genome should be sufficient to allow the identification and
localization of any trait-related polymorphism.

- each marker should have an adequate level of heterozygosity, so as to be informative in a large percentage of different
meioses.

- all markers should be easily typed on a routine basis, at a reasonable expense, and in a reasonable amount of time.

- the entire set of markers per chromosome should be ordered in a highly reliable fashion.

However, while the above maps are optimal, it will be appreciated that the maps of the present invention may
be used in the individual marker and haplotype association analyses described below without the necessity of
determining the order of biallelic markers derived from a single BAC with respect to one another.

Genetic Maps Based on RFLPs or VNTRs
The analysis of DNA polymorphisms has relied on the following types of polymorphisms. The first generation
of genetic markers were restriction fragment length polymorphisms (RFLPs), single nucleotide polymorphisms which
occur at restriction sites, thereby modifying the cleavage pattern of the corresponding restriction enzyme. Though the
original methods used to type RFLPs were material-, effort- and time-consuming, today these markers can easily be
typed by PCR-based technologies. Since they are biallelic markers (they present only two alleles, the restriction site
being either present or absent), their maximum heterozygosity is 0.5. The theoretical number of RFLPs distributed along
the entire human genome is more than 10^5, which leads to a potential average inter-marker distance of 30 kilobases.
However, in reality the number of evenly distributed RFLPs which occur at a sufficient frequency in the population to
make them useful for tracking of genetic polymorphisms is very limited.

The second generation of genetic markers was VNTRs (Variable Number of Tandem Repeats), which can be
categorized as either minisatellites or microsatellites. Minisatellites are tandemly repeated DNA sequences present in
units of 5-50 repeats which are distributed along regions of the human chromosomes ranging from 0.1 to 20 kilobases in
length. Since they present many possible alleles, their polymorphic informative content is very high. Minisatellites are
scored by performing Southern blots to identify the number of tandem repeats present in a nucleic acid sample from the
individual being tested. However, there are only 10^ potential VNTRs that can be typed by Southern blotting.

Microsatellites (also called simple tandem repeat polymorphisms, or simple sequence length polymorphisms)
constitute the most developed category of genetic markers. They include small arrays of tandem repeats of simple
sequences (di-, tri-, tetra-nucleotide repeats) which exhibit a high degree of length polymorphism and thus a high level of
informativeness. Slightly more than 5,000 microsatellites, easily typed by PCR-derived technologies, have been ordered
along the human genome (Dib et al., Nature 380:152 (1996), the disclosure of which is incorporated herein by
reference).

A number of these available microsatellites were used to construct integrated physical and genetic maps
containing less than 5,000 markers. For example, CEPH (Chumakov et al., Nature 377: 175-298 (1995) and Cohen et al.,
Nature 366: 698-701 (1993), the disclosures of which are incorporated herein by reference), and Whitehead Institute
and Genethon (Hudson et al., 1995), constructed genetic and physical maps covering 75% to 95% of the human genome,
based on 2500 to 5000 microsatellite markers.

However, the number of easily typed informative markers in these maps was too small for the average
distance between informative markers to fulfill the above-listed requirements for genetic maps.

Biallelic Markers

Biallelic markers are genome-derived polynucleotides which exhibit biallelic polymorphism. As used herein, the
term biallelic marker means a biallelic single nucleotide polymorphism. As used herein, the term polymorphism may
include a single base substitution, insertion, or deletion. By definition, the lowest allele frequency of a biallelic
polymorphism is 1% (sequence variants which show allele frequencies below 1% are called rare mutations). There are
potentially more than 10^7 biallelic markers which can easily be typed by routine automated techniques, such as
sequence- or hybridization-based techniques, out of which 10^6 are sufficiently informative for mapping purposes.
However, a biallelic marker will show a sufficient degree of informativeness for use in genetic mapping only if the
frequency of its less frequent allele is not less than about 10% (i.e. a heterozygosity rate of at least 0.18) (the
heterozygosity rate for a biallelic marker is 2 Pa (1-Pa), where Pa is the frequency of allele a). Preferably, the frequency
of the less frequent allele of the biallelic markers in the present maps is at least 20% (i.e. a heterozygosity rate of at
least 0.32). More preferably, the frequency of the less frequent allele of the biallelic markers in the present maps is at
least 30% (i.e. its heterozygosity rate is higher than about 0.42).
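
The correspondence between the frequency of the less frequent allele and the heterozygosity rate stated above follows directly from the formula 2 Pa (1-Pa). The following minimal sketch simply evaluates that formula at the thresholds named in the text; the function name is an illustrative assumption.

```python
def heterozygosity(p_a: float) -> float:
    """Heterozygosity rate 2 * p_a * (1 - p_a) of a biallelic marker with allele frequency p_a."""
    if not 0.0 <= p_a <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    return 2.0 * p_a * (1.0 - p_a)


# Thresholds named in the text, plus 0.5, where a biallelic marker is most informative.
for frequency in (0.10, 0.20, 0.30, 0.50):
    print(f"less frequent allele {frequency:.2f} -> heterozygosity {heterozygosity(frequency):.2f}")
```

The values 0.18, 0.32 and 0.42 quoted in the text are recovered for frequencies of 10%, 20% and 30%, and the maximum of 0.5 is reached when both alleles are equally frequent.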

Initial attempts to construct genetic maps based on non-RFLP biallelic markers have focused on identifying
biallelic markers lying within sequence tagged sites (STS), pieces of genomic DNA having a known sequence and
averaging about 250 bases in length. More than 30,000 STSs have been identified and ordered along the genome
(Hudson et al., Science 270:1945-1954 (1995); Schuler et al., Science 274:540-548 (1996), the disclosures of which
are incorporated herein by reference). For example, the Whitehead Institute and Genethon's integrated map contains
15,086 STSs.

These sequence tagged sites can be screened to identify polymorphisms, preferably Single Nucleotide
Polymorphisms (SNPs), more preferably non RFLP biallelic markers therein. Generally polymorphisms are identified by
determining the sequence of the STSs in 5 to 10 individuals.

Wang et al. (Cold Spring Harbor Laboratory: Abstracts of papers presented on Genome Mapping and
Sequencing (May 14-18, 1997), the disclosure of which is incorporated herein by reference) recently announced the
identification and mapping of 750 Single Nucleotide Polymorphisms issued from the sequencing of 12,000 STSs from
the Whitehead/MIT map, in eight unrelated individuals. The map was assembled using a high throughput system based
on the utilization of DNA chip technology available from Affymetrix (Chee et al., Science 274:610-614 (1996), the
disclosure of which is incorporated herein by reference).



WO 99/04038    PCT/IB98/01193

However, according to experimental data and statistical calculations, less than one out of 10 of all STSs
mapped today will contain an informative Single Nucleotide Polymorphism. This is primarily due to the short length of
existing STSs (usually less than 250 bp). If one assumes 10^6 informative SNPs spread along the human genome, there
would on average be one marker of interest every 3 x 10^9/10^6, i.e. every 3,000 bp. The probability that one such marker
is present on a 250 bp stretch is thus less than 1/10.
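
The arithmetic behind this estimate can be reproduced with a short sketch. The genome size, number of informative SNPs and STS length are the figures given in the text; the function names, and the assumption that markers are spread uniformly at random (so that a Poisson estimate applies), are illustrative only.

```python
import math

GENOME_BP = 3_000_000_000      # 3 x 10^9 base pairs, as stated in the text
INFORMATIVE_SNPS = 1_000_000   # 10^6 informative SNPs, as assumed in the text
STS_LENGTH_BP = 250            # typical STS length, as stated in the text


def mean_spacing_bp(genome_bp: int, n_markers: int) -> float:
    """Average distance between markers spread over the genome."""
    return genome_bp / n_markers


def expected_markers(stretch_bp: int, spacing_bp: float) -> float:
    """Expected number of markers on a stretch; an upper bound on P(at least one)."""
    return stretch_bp / spacing_bp


def prob_at_least_one(stretch_bp: int, spacing_bp: float) -> float:
    """Poisson estimate of P(at least one marker), assuming uniform random placement."""
    return 1.0 - math.exp(-stretch_bp / spacing_bp)


spacing = mean_spacing_bp(GENOME_BP, INFORMATIVE_SNPS)
print(f"mean spacing: {spacing:.0f} bp")
print(f"expected markers per STS: {expected_markers(STS_LENGTH_BP, spacing):.3f}")
print(f"P(at least one marker): {prob_at_least_one(STS_LENGTH_BP, spacing):.3f}")
```

Under either estimate the chance that a 250 bp STS carries an informative marker stays below 1/10, consistent with the figure given above.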

While it could produce a high density map, the STS approach based on currently existing markers does not put
any systematic effort into making sure that the markers obtained are optimally distributed throughout the entire
genome. Instead, polymorphisms are limited to those locations for which STSs are available.

The even distribution of markers along the chromosomes is critical to the future success of genetic analyses.
In particular, a high density map having appropriately spaced markers is essential for conducting association studies on
sporadic cases, aiming at identifying genes responsible for detectable traits such as those which are described below.

As will be further explained below, genetic studies have mostly relied in the past on a statistical approach
called linkage analysis, which took advantage of microsatellite markers to study their inheritance pattern within families
from which a sufficient number of individuals presented the studied trait. Because of intrinsic limitations of linkage
analysis, which will be further detailed below, and because these studies necessitate the recruitment of adequate family
pedigrees, they are not well suited to the genetic analysis of all traits, particularly those for which only sporadic cases
are available (e.g. drug response traits), or those which have a low penetrance within the studied population.

Association studies offer an alternative to linkage analysis. Combined with the use of a high density map of
appropriately spaced, sufficiently informative markers, association studies, including linkage disequilibrium-based
genome wide association studies, will enable the identification of most genes involved in complex traits.

The present invention relates to a method for generating a high density linkage disequilibrium-based genetic
map of the human genome which will allow the identification of sufficiently informative markers spaced at intervals
which permit their use in identifying genes responsible for detectable traits using genome-wide association studies and
linkage disequilibrium mapping.

Construction of a Physical Map

The first step in constructing a high density genetic map of biallelic markers is the construction of a physical
map. Physical maps consist of ordered, overlapping cloned fragments of genomic DNA covering a portion of the genome,
preferably covering one or all chromosomes. Obtaining a physical map of the genome entails constructing and ordering a
genomic DNA library.

Physical mapping in complex genomes such as the human genome (3,000 Megabases) requires the construction
of DNA libraries containing large inserts (on the order of 0.1 to 1 Megabase). It is crucial that such libraries be easy to
construct, screen and manipulate, and that the DNA inserts be stable and relatively free of chimerism.

Yeast artificial chromosomes (YACs; Burke et al., Science 236:806-812 (1987), the disclosure of which is
incorporated herein by reference) have provided an invaluable tool in the analysis of complex genomes since their cloning
capacity is extremely high (in the Mb range). YAC libraries containing large DNA inserts (up to 2 Mb) have been used to
generate STS-content maps of individual chromosomes or of the entire human genome (Chumakov et al. (1995), supra;
Hudson et al. (1995), supra; Cohen et al., Nature 366:698-701 (1993); Chumakov et al., Nature 359:380-387 (1992);
Gemmill et al., Nature 377:299-319 (1995); Doggett et al., Nature 377:335-365 (1995); the disclosures of which are
incorporated herein by reference).

The present genetic maps may be constructed using currently available YAC genomic libraries such as the
CEPH human YAC library as a starting material (Chumakov et al. (1995), supra). Alternatively, one may construct a
YAC genomic library as described in Chumakov et al., 1995, the disclosure of which is incorporated herein by reference,
or as described below.

Once a YAC genomic library has been obtained, the genomic DNA fragments therein are ordered. Ordering may
be performed directly on the genomic DNA in the YAC library. However, direct ordering of YAC inserts is not preferred
because YAC libraries often exhibit a high rate of chimerism (40 to 50% of YAC clones contain fragments from more
than one genomic region), often suffer from clonal instability within their genomic DNA inserts, and require tedious
procedures to manipulate and isolate the insert DNA. Instead, it is preferable to conduct the mapping and sequencing
procedures required for ordering the genomic DNA in a system which enables the stable cloning of large inserts while
being easy to manipulate using standard molecular biology techniques.

Accordingly, it is preferable to clone the genomic DNA into bacterial single copy plasmids, for example BACs
(Bacterial Artificial Chromosomes), rather than into YACs. Bacterial artificial chromosomes are well suited for use in
ordering genomic DNA fragments. BACs provide a low rate of chimerism and fragment rearrangement, together with
relative ease of insert isolation. Thus BAC libraries are well suited to integrate genetic, STS and cytogenetic
information while providing direct access to stable, readily-sequenceable genomic DNA. An example of a bacterial artificial
chromosome is the BAC cloning system of Shizuya et al., which is capable of stably propagating and maintaining
relatively large genomic DNA fragments (up to 200 kb long) as single-copy plasmids in E. coli (Shizuya et al., Proc. Natl.
Acad. Sci. USA 89:8794-8797 (1992), the disclosure of which is incorporated herein by reference).

Example 1 describes the construction of a BAC library containing human genomic DNA. It will be appreciated
that the source of the genomic DNA, the enzymes used to digest the DNA, the vectors into which the genomic DNA is
inserted, and the size of the DNA inserts which are cloned into said vectors need not be identical to those described in
Example 1 below. Rather, the genomic DNA may be obtained from any appropriate source, may be digested with any
appropriate enzyme, and may be cloned into any suitable vector. Insert size may vary within any range compatible with
the cloning system chosen and with the intended purpose of the library being constructed. Typically, using BAC vectors
to construct DNA libraries covering the entire human genome, insert size may vary between 50 kb and 300 kb, preferably
between 100 kb and 200 kb.

Example 1
Construction of a BAC library
Three different human genomic DNA libraries were produced by cloning partially digested DNA from a human
lymphoblastoid cell line (derived from individual 8445, CEPH families) into the pBeloBAC11 vector (Kim et al.,
Genomics 34:213-218 (1996), the disclosure of which is incorporated herein by reference). One library was produced
using a BamHI partial digestion of the genomic DNA from the lymphoblastoid cell line and contains 110,000 clones
having an average insert size of 150 kb (corresponding to 5 human haploid genome equivalents). Another library was
prepared from a HindIII partial digest and corresponds to 3 human genome equivalents with an average insert size of
150 kb. A third library was prepared from an NdeI partial digest and corresponds to 4 human genome equivalents with an
average insert size of 150 kb.
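
The relation between clone count, insert size and genome equivalents quoted in Example 1 can be sketched as below.
This is a rough illustration, assuming a haploid genome of 3,000 Mb and ignoring empty or chimeric clones; the
function names are hypothetical.

```python
GENOME_KB = 3_000_000  # haploid human genome, about 3,000 Mb


def genome_equivalents(n_clones, mean_insert_kb, genome_kb=GENOME_KB):
    """Total cloned DNA expressed as multiples of the haploid genome."""
    return n_clones * mean_insert_kb / genome_kb


def clones_needed(equivalents, mean_insert_kb, genome_kb=GENOME_KB):
    """Approximate clone count required to reach a given redundancy."""
    return round(equivalents * genome_kb / mean_insert_kb)


# BamHI library: 110,000 clones with 150 kb average inserts
bamhi_cover = genome_equivalents(110_000, 150)
# Clone counts implied by 3 and 4 equivalents at 150 kb (HindIII and NdeI libraries)
hindiii_clones = clones_needed(3, 150)
ndei_clones = clones_needed(4, 150)
print(bamhi_cover, hindiii_clones, ndei_clones)
```

With these inputs the BamHI library comes out at about 5.5 raw equivalents, in line with the figure of about 5 given
in the text; the clone counts for the other two libraries are only implied by the stated redundancy, not reported.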

Alternatively, the genomic DNA may be inserted into BAC vectors which possess both a high copy number
origin of replication, which facilitates the isolation of the vector DNA, and a low copy number origin of replication.
Cloning of a genomic DNA insert into the high copy number origin of replication inactivates the origin such that clones
containing a genomic insert replicate at low copy number. The low copy number of clones having a genomic insert
therein permits the inserts to be stably maintained. In addition, selection procedures may be designed which enable low
copy number plasmids (i.e. vectors having genomic inserts therein) to be selected. Such vectors and selection procedures
are described in the U.S. Patent Application entitled "High Throughput DNA Sequencing Vector" (GENSET.015A, Serial
No. 09/058,746), the disclosure of which is incorporated herein by reference.

It will be appreciated that the present methods may be practiced using BAC vectors other than those of
Shizuya et al. (1992, supra), or derived from those, or vectors other than BAC vectors which possess the
above-described characteristics.

To construct a physical map of the genome from genomic DNA libraries, the library clones have to be ordered
along the human chromosomes. In a preferred embodiment, a minimal subset of the ordered clones will then be chosen
that completely covers the entire genome.

For example, the genomic DNA in the inserts of the above described BAC vectors is ordered using STS markers whose
positions relative to one another and locations along the genome are known, using procedures such as those described
herein. The STS markers used to order the BAC inserts may be the STS markers contained in the integrated maps
described above. Alternatively, the STSs may be STSs which are not contained in any of the physical maps described
above. In another embodiment the STSs may be a combination of STSs included in the physical maps described above
and STSs which are not included in the integrated maps described above.

The BAC vectors are screened with STSs until there is at least one positive BAC clone per STS. Preferably, a
minimally overlapping set of 10,000 to 30,000 BACs having genomic inserts spanning the entire human genome is
identified. More preferably, a minimally overlapping set of 10,000 to 30,000 BACs having genomic inserts of about
100-300 kb in length spanning the entire human genome is identified. In a preferred embodiment, a minimally overlapping set
of 10,000 to 30,000 BACs having genomic inserts of about 100-150 kb in length spanning the entire human genome is
identified. In a highly preferred embodiment, a minimally overlapping set of 15,000 to 25,000 BACs having genomic
inserts of about 100-200 kb in length spanning the entire human genome is identified. Alternatively, a smaller number of
BACs spanning a set of chromosomes, a single chromosome, a particular subchromosomal region, or any other desired
portion of the genome may be ordered. The BACs may be screened for the presence of STSs as described in Example 2
below.

Example 2
Ordering of a BAC Library: Screening Clones with STSs
The BAC library is screened with a set of PCR-typeable STSs to identify clones containing the STSs. To
facilitate PCR screening of several thousand clones, for example 200,000 clones, pools of clones are prepared.

Three-dimensional pools of the BAC libraries are prepared as described in Chumakov et al. and are screened for
the ability to generate an amplification fragment in amplification reactions conducted using primers derived from the
ordered STSs (Chumakov et al. (1995), supra). A BAC library typically contains 200,000 BAC clones. Since the average
size of each insert is 100-300 kb, the overall size of such a library is equivalent to the size of at least about 7 human
genomes. This library is stored as an array of individual clones in 518 384-well plates. It can be divided into 74 primary
pools (7 plates each). Each primary pool can then be divided into 48 subpools prepared by using a three-dimensional
pooling system based on the plate, row and column address of each clone (more particularly, 7 subpools consisting of all
clones residing in a given microtiter plate; 16 subpools consisting of all clones in a given row; 24 subpools consisting of
all clones in a given column).

Amplification reactions are conducted on the pooled BAC clones using primers specific for the STSs. For
example, the three dimensional pools may be screened with 45,000 STSs whose positions relative to one another and
locations along the genome are known. Preferably, the three dimensional pools are screened with about 30,000 STSs
whose positions relative to one another and locations along the genome are known. In a highly preferred embodiment,
the three dimensional pools are screened with about 20,000 STSs whose positions relative to one another and locations
along the genome are known.

Amplification products resulting from the amplification reactions are detected by conventional agarose gel
electrophoresis combined with automatic image capturing and processing. PCR screening for an STS involves three
steps: (1) identifying the positive primary pools; (2) for each positive primary pool, identifying the positive plate, row and
column 'subpools' to obtain the address of the positive clone; (3) directly confirming the PCR assay on the identified
clone. PCR assays are performed with primers specifically defining the STS.
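
The three-step pooled screen can be sketched as a small address-decoding routine. This is a minimal illustration,
not the procedure of Chumakov et al.: the 0-based plate numbering, the row letters and all function names are
assumptions made for the example.

```python
from itertools import product
from math import ceil

N_PLATES = 518                # 384-well plates in the library (from the text)
PLATES_PER_PRIMARY_POOL = 7   # plates combined into one primary pool (from the text)


def n_primary_pools(n_plates=N_PLATES, per_pool=PLATES_PER_PRIMARY_POOL):
    """Number of primary pools needed to hold every plate."""
    return ceil(n_plates / per_pool)


def primary_pool_of(plate_index, per_pool=PLATES_PER_PRIMARY_POOL):
    """Map a 0-based plate index to a 0-based primary pool index (hypothetical numbering)."""
    return plate_index // per_pool


def candidate_clones(positive_plates, positive_rows, positive_columns):
    """All (plate, row, column) addresses consistent with the positive subpools.

    One positive subpool per dimension yields a single address. Several positives
    in a dimension yield several candidates, each of which still needs the direct
    confirmation PCR of step (3).
    """
    return sorted(product(positive_plates, positive_rows, positive_columns))


unique = candidate_clones({3}, {"C"}, {17})
ambiguous = candidate_clones({3}, {"C", "F"}, {17, 20})
print(n_primary_pools(), unique, len(ambiguous))
```

The sketch shows why step (3) matters: two positive rows and two positive columns in one primary pool leave four
candidate wells, of which only some contain the STS.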

Screening is conducted as follows. First, BAC DNA containing the genomic inserts is prepared as follows.
Bacteria containing the BACs are grown overnight at 37°C in 120 µl of LB containing chloramphenicol (12 µg/ml). DNA
is extracted by the following protocol:

Centrifuge 10 min at 4°C and 2000 rpm
Eliminate supernatant and resuspend pellet in 120 µl TE 10-2 (Tris HCl 10 mM, EDTA 2 mM)
Centrifuge 10 min at 4°C and 2000 rpm
Eliminate supernatant and incubate pellet with 20 µl lysozyme 1 mg/ml during 15 min at room temperature
Add 20 µl proteinase K 100 µg/ml and incubate 15 min at 60°C
Add 8 µl DNAse 1 U/µl and incubate 1 hr at room temperature
Add 100 µl TE 10-2 and keep at -80°C

PCR assays are performed using the following protocol:

Final volume                                             15 µl
BAC DNA                                                  1.7 ng/µl
dNTP (each)                                              200 µM
primer (each)                                            2.9 ng/µl
Ampli Taq Gold DNA polymerase                            0.05 unit/µl
PCR buffer (10x = 0.1 M Tris HCl pH 8.3, 0.5 M KCl)      1x

The amplification is performed on a Genius II thermocycler. After heating at 95°C for 10 min, 40 cycles are performed.
Each cycle comprises: 30 sec at 95°C, 54°C for 1 min, and 30 sec at 72°C. For final elongation, 10 min at 72°C end
the amplification. PCR products are analyzed on 1% agarose gel with 0.1 mg/ml ethidium bromide.

Alternatively, a YAC (Yeast Artificial Chromosome) library can be used. The very large insert size, of the order
of 1 megabase, is the main advantage of the YAC libraries. The library can typically include about 33,000 YAC clones as
described in Chumakov et al. (1995, supra). The YAC screening protocol may be the same as the one used for BAC
screening.

The known order of the STSs is then used to align the BAC inserts in an ordered array (contig) spanning the
whole human genome. If necessary, new STSs to be tested can be generated by sequencing the ends of selected BAC
inserts. Subchromosomal localization of the BACs can be established and/or verified by fluorescence in situ hybridization
(FISH), performed on metaphasic chromosomes as described by Cherif et al. 1990 and in the Examples below. BAC insert
size may be determined by Pulsed Field Gel Electrophoresis after digestion with the restriction enzyme NotI.

Finally, a minimally overlapping set of BAC clones, with known insert size and subchromosomal location,
covering the entire genome, a set of chromosomes, a single chromosome, a particular subchromosomal region, or any
other desired portion of the genome is selected from the DNA library. For example, the BAC clones may cover at least
100 kb of contiguous genomic DNA, at least 250 kb of contiguous genomic DNA, at least 500 kb of contiguous genomic
DNA, at least 2 Mb of contiguous genomic DNA, at least 5 Mb of contiguous genomic DNA, at least 10 Mb of contiguous
genomic DNA, or at least 20 Mb of contiguous genomic DNA.
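
Selecting a minimally overlapping set of ordered clones is an interval-covering problem. The sketch below uses a
standard greedy approach and is offered only as an illustration; the text does not specify a selection algorithm,
and the clone names and coordinates are hypothetical.

```python
def minimal_tiling_path(clones, region_start, region_end):
    """Pick the fewest clones covering [region_start, region_end].

    clones: list of (name, start_kb, end_kb) with positions already known from
    STS ordering. Raises ValueError if the clones leave a gap in the region.
    """
    ordered = sorted(clones, key=lambda c: c[1])
    path, covered_to, i = [], region_start, 0
    while covered_to < region_end:
        best = None
        # Among clones starting at or before the covered point, keep the one reaching furthest.
        while i < len(ordered) and ordered[i][1] <= covered_to:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered_to:
            raise ValueError(f"gap in coverage at {covered_to} kb")
        path.append(best[0])
        covered_to = best[2]
    return path


# Hypothetical contig of five overlapping BACs spanning 450 kb
example_clones = [
    ("BAC1", 0, 150), ("BAC2", 50, 180), ("BAC3", 140, 300),
    ("BAC4", 170, 320), ("BAC5", 290, 450),
]
tiling = minimal_tiling_path(example_clones, 0, 450)
print(tiling)
```

In this example three of the five clones suffice; the two clones that add no new coverage are left out of the path.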

Identification of biallelic markers

In order to generate polymorphisms having the adequate informative content to be used as biallelic markers for
genetic mapping, the sequences of random genomic fragments from an appropriate number of unrelated individuals are
compared. Genomic sequences to be screened for biallelic markers may be generated by partially sequencing BAC
inserts, preferably by sequencing the ends of BAC subclones. Sequencing the ends of an adequate number of BAC
subclones derived from a minimally overlapping array of BACs such as those described above will allow the generation of
biallelic markers spanning the entire genome, a set of chromosomes, a single chromosome, a particular subchromosomal
region, or any other desired portion of the genome with an optimized inter-marker spacing.

Thus, portions of the BACs in the selected ordered array are then subcloned and sequenced using, for example,
the procedures described below.

Example 3
Subcloning of BACs

The cells obtained from three liters overnight culture of each BAC clone are treated by alkaline lysis using
conventional techniques to obtain the BAC DNA containing the genomic DNA inserts. After centrifugation of the BAC
DNA in a cesium chloride gradient, ca. 50 µg of BAC DNA are purified. 5-10 µg of BAC DNA are sonicated using three
distinct conditions, to obtain fragments within a desired size range. The obtained DNA fragments are end-repaired in a
50 µl volume with two units of Vent polymerase for 20 min at 70°C, in the presence of the four deoxytriphosphates
(100 µM). The resulting blunt-ended fragments are separated by electrophoresis on preparative low-melting point 1%
agarose gels (60 Volts for 3 hours). The fragments lying within a desired size range, such as 600 to 6,000 bp, are
excised from the gel and treated with agarase. After chloroform extraction and dialysis on Microcon 100 columns, DNA
in solution is adjusted to a 100 ng/µl concentration. A ligation to a linearized, dephosphorylated, blunt-ended plasmid
cloning vector is performed overnight by adding 100 ng of BAC fragmented DNA to 20 ng of pBluescript II SK (+) vector
DNA linearized by enzymatic digestion, and treating with alkaline phosphatase. The ligation reaction is performed in a
10 µl final volume in the presence of 40 units/µl T4 DNA ligase (Epicentre). The ligated products are electroporated into
the appropriate cells (ElectroMAX E. coli DH10B cells). IPTG and X-gal are added to the cell mixture, which is then
spread on the surface of an ampicillin-containing agar plate. After overnight incubation at 37°C, recombinant (white)
colonies are randomly picked and arrayed in 96 well microplates for storage and sequencing.

Alternatively, BAC subcloning may be performed using vectors which possess both a high copy number origin
of replication, which facilitates the isolation of the vector DNA, and a low copy number origin of replication. Cloning of
a genomic DNA fragment into the high copy number origin of replication inactivates the origin such that clones
containing a genomic insert replicate at low copy number. The low copy number of clones having a genomic insert
therein permits the inserts to be stably maintained. In addition, selection procedures may be designed which enable low
copy number plasmids (i.e. vectors having genomic inserts therein) to be selected. In a preferred embodiment, BAC
subcloning will be performed in vectors having the above described features and moreover enabling high throughput
sequencing of long fragments of genomic DNA. Such high throughput high quality sequencing may be obtained after
generating successive deletions within the subcloned fragments to be sequenced, using transposition-based or enzymatic
systems. Such vectors are described in the U.S. Patent Application entitled "High Throughput DNA Sequencing Vector"
(GENSET.015A, Serial No. 09/058,746), the disclosure of which is incorporated herein by reference.

It will be appreciated that other subcloning methods familiar to those skilled in the art may also be employed.
The resulting subclones are then partially sequenced using, for example, the procedures described below.

Example 4
Partial sequencing of BAC subclones
The genomic DNA inserts in the subclones, such as the BAC subclones prepared above, are amplified by
conducting PCR reactions on the overnight bacterial cultures, using primers complementary to vector sequences flanking
the insertions.

The sequences of the insert extremities (on average 500 bases at each end, obtained under routine sequencing
conditions) are determined by fluorescent automated sequencing on ABI 377 sequencers, using ABI Prism DNA
Sequencing Analysis software. Following gel image analysis and DNA sequence extraction, sequence data are
automatically processed with adequate software to assess sequence quality. A proprietary base-caller automatically
flags suspect peaks, taking into account the shape of the peaks, the inter-peak resolution and the noise level. The
proprietary base-caller also performs an automatic trimming. Any stretch of 25 or fewer bases having more than 4 suspect
peaks is usually considered unreliable and is discarded.
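
The trimming rule can be approximated with a sliding window. The base-caller referred to above is proprietary and
its algorithm is not disclosed, so the sketch below is only one plausible reading of the rule; the function name and
the choice to discard the whole offending window are assumptions.

```python
def unreliable_mask(suspect, window=25, max_suspect=4):
    """Mark bases lying in any window of `window` bases with more than `max_suspect` flags.

    suspect: list of bools, one per base, True where the peak was flagged as suspect.
    Returns a list of bools of the same length, True where the base would be discarded.
    """
    n = len(suspect)
    mask = [False] * n
    count = sum(suspect[:window])          # flags in the first window
    for start in range(0, max(n - window, 0) + 1):
        if start > 0:
            # Slide the window one base to the right.
            count += suspect[start + window - 1] - suspect[start - 1]
        if count > max_suspect:
            for j in range(start, min(start + window, n)):
                mask[j] = True
    return mask


# Hypothetical 100-base read with five suspect peaks clustered near the start
flags = [False] * 100
for pos in (10, 12, 14, 16, 18):
    flags[pos] = True
mask = unreliable_mask(flags)
kept = [i for i, bad in enumerate(mask) if not bad]
print(kept[0], len(kept))
```

A stricter variant could discard only the span between the first and last suspect peak in each offending window
rather than the whole window.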

The sequenced regions of the subclones, such as the BAC subclones prepared above, are then analyzed in
order to identify biallelic markers lying therein. The frequency at which biallelic markers will be detected in the
screening process varies with the average level of heterozygosity desired. For example, if biallelic markers having an
average heterozygosity rate of greater than 0.42 are desired, they will occur every 2.5 to 3 kb on average. Therefore,
on average, six 500 bp genomic fragments have to be screened in order to derive 1 biallelic marker having an adequate
informative content.
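
The figure of six fragments follows directly from the marker spacing and the fragment length. A minimal sketch,
with a hypothetical function name:

```python
from math import ceil


def fragments_to_screen(marker_spacing_bp, fragment_bp=500):
    """Average number of fragments to sequence to encounter one marker."""
    return ceil(marker_spacing_bp / fragment_bp)


upper = fragments_to_screen(3000)   # one marker every 3 kb
lower = fragments_to_screen(2500)   # one marker every 2.5 kb
print(lower, upper)
```

At the 3 kb end of the range six fragments are needed, and at the 2.5 kb end five.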

As a preferred alternative to sequencing the ends of an adequate number of BAC subclones, the above
mentioned high throughput deletion-based sequencing vectors, which allow the generation of high quality sequence
information covering fragments of ca. 6 kb, may be used. Having sequence fragments longer than 2.5 or 3 kb enhances
the chances of identifying biallelic markers therein. Methods of constructing and sequencing a nested set of deletions
are disclosed in the U.S. Patent Application entitled "High Throughput DNA Sequencing Vector" (GENSET.015A, Serial
No. 09/058,746), the disclosure of which is incorporated herein by reference.

To identify biallelic markers using partial sequence information derived from subclone ends,
such as the ends of the BAC subclones prepared above, pairs of primers, each one specifically
defining a 500 bp amplification fragment, are designed using the above mentioned partial sequences.
The primers used for the genomic amplification of fragments derived from the subclones, such as
the BAC subclones prepared above, may be designed using the OSP software (Hillier L. and Green
P., Methods Appl., 1:124-8 (1991), the disclosure of which is incorporated herein by reference). The
GC content of the amplification primers preferably ranges between 10 and 75 %, more preferably
between 35 and 60 %, and most preferably between 40 and 55 %. The length of amplification
primers can range from 10 to 100 nucleotides, preferably from 10 to 50, 10 to 30 or more preferably
10 to 20 nucleotides. Shorter primers tend to lack specificity for a target nucleic acid sequence and
generally require cooler temperatures to form sufficiently stable hybrid complexes with the
template. Longer primers are expensive to produce and can sometimes self-hybridize to form hairpin
structures.
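
The GC content and length preferences lend themselves to a simple filter. The sketch below is not the OSP
software; it only checks a candidate primer against the most preferred ranges stated above, and the function names
and the example sequence are hypothetical.

```python
def gc_percent(primer):
    """Percentage of G and C bases in a primer sequence."""
    seq = primer.upper()
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


def meets_preferences(primer, gc_range=(40, 55), length_range=(10, 20)):
    """True if the primer falls within the given GC and length ranges (inclusive)."""
    gc_ok = gc_range[0] <= gc_percent(primer) <= gc_range[1]
    length_ok = length_range[0] <= len(primer) <= length_range[1]
    return gc_ok and length_ok


candidate = "ATGCGTACCTGAGGCTAAGC"   # hypothetical 20-mer, 11 of 20 bases are G or C
print(gc_percent(candidate), meets_preferences(candidate))
print(meets_preferences("ATATATATATAT"))   # no G or C, fails the GC range
```

Widening `gc_range` to (10, 75) and `length_range` to (10, 100) applies the broadest ranges given in the text
instead.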

All primers may contain, upstream of the specific target bases, a common oligonucleotide tail that serves as a
sequencing primer. Those skilled in the art are familiar with primer extensions which can be used for these purposes.

To identify biallelic markers, the sequences corresponding to the partial sequences determined above are
determined and compared in a plurality of individuals. The population used to identify biallelic markers having an
adequate informative content preferably consists of ca. 100 unrelated individuals from a heterogeneous population.

First, DNA is extracted from the peripheral venous blood of each donor using methods such as those described
in Example 5.

Example 5
Extraction of DNA

30 ml of blood are taken from the individuals in the presence of EDTA. Cells (pellet) are collected after
centrifugation for 10 minutes at 2000 rpm. Red cells are lysed by a lysis solution (50 ml final volume: 10 mM Tris
pH 7.6; 5 mM MgCl2; 10 mM NaCl). The solution is centrifuged (10 minutes, 2000 rpm) as many times as necessary to
eliminate the residual red cells present in the supernatant, after resuspension of the pellet in the lysis solution.

The pellet of white cells is lysed overnight at 42°C with 3.7 ml of lysis solution composed of:

- 3 ml TE 10-2 (Tris-HCl 10 mM, EDTA 2 mM) / NaCl 0.4 M
- 200 µl SDS 10%
- 500 µl K-proteinase (2 mg K-proteinase in TE 10-2 / NaCl 0.4 M).

For the extraction of proteins, 1 ml saturated NaCl (6M) (1/3.5 v/v) is added. After vigorous agitation, the
solution is centrifuged for 20 minutes at 10000 rpm.

For the precipitation of DNA, 2 to 3 volumes of 100% ethanol are added to the previous supernatant, and the solution is
centrifuged for 30 minutes at 2000 rpm. The DNA solution is rinsed three times with 70% ethanol to eliminate salts,
and centrifuged for 20 minutes at 2000 rpm. The pellet is dried at 37°C, and resuspended in 1 ml TE 10-1 or 1 ml
water. The DNA concentration is evaluated by measuring the OD at 260 nm (1 unit OD = 50 µg/ml DNA).

To evaluate the presence of proteins in the DNA solution, the OD 260 / OD 280 ratio is determined. Only DNA
preparations having an OD 260 / OD 280 ratio between 1.8 and 2 are used in the subsequent steps described below.
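
The two spectrophotometric checks translate into a short calculation. The sketch below is illustrative only; the
function names and the optional dilution factor are assumptions, not part of the protocol.

```python
def dna_concentration_ug_per_ml(od260, dilution_factor=1.0):
    """DNA concentration from absorbance at 260 nm, using 1 OD unit = 50 µg/ml."""
    return od260 * 50.0 * dilution_factor


def passes_purity_check(od260, od280, low=1.8, high=2.0):
    """True if the OD 260 / OD 280 ratio lies within the accepted range."""
    return low <= od260 / od280 <= high


conc = dna_concentration_ug_per_ml(0.5)        # 0.5 OD units, undiluted
clean = passes_purity_check(0.95, 0.5)         # ratio 1.9
contaminated = passes_purity_check(0.7, 0.5)   # ratio 1.4, suggests protein carry-over
print(conc, clean, contaminated)
```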

Once genomic DNA from every individual in the given population has been extracted, it is preferred that a
fraction of each DNA sample is separated, after which a pool of DNA is constituted by assembling equivalent DNA
amounts of the separated fractions into a single one.

Second, the DNA obtained from peripheral blood as described above is amplified using the above mentioned
amplification primers.

Example 6 provides procedures that may be used in the amplification reactions and the detection of
polymorphisms within the obtained amplicons.



Example 6
Amplification of DNA from Peripheral Blood
and Identification of Biallelic Markers
The amplification of each sequence is performed on pooled DNA samples obtained as in Example 5 above, using
PCR (Polymerase Chain Reaction) as follows:

- final volume                                            25 µl
- genomic DNA                                             2 ng/µl
- MgCl2                                                   2 mM
- dNTP (each)                                             200 µM
- primer (each)                                           2.9 ng/µl
- Ampli Taq Gold DNA polymerase (Perkin)                  0.05 unit/µl
- PCR buffer (10X = 0.1 M Tris HCl pH 8.3, 0.5 M KCl)     1X.
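
The final concentrations listed above can be converted into absolute amounts per reaction. The sketch below is a
convenience calculation only; the function name and dictionary keys are hypothetical, and the primer is omitted for
brevity.

```python
def per_reaction_amounts(final_volume_ul=25.0):
    """Absolute amounts implied by the final concentrations for one reaction."""
    v = final_volume_ul
    return {
        "genomic_dna_ng": round(2.0 * v, 6),          # 2 ng/µl
        "mgcl2_nmol": round(2.0 * v, 6),              # 2 mM  = 2 nmol/µl
        "each_dntp_nmol": round(0.2 * v, 6),          # 200 µM = 0.2 nmol/µl
        "polymerase_units": round(0.05 * v, 6),       # 0.05 unit/µl
        "buffer_10x_ul": round(v / 10.0, 6),          # 10X stock diluted to 1X
    }


amounts = per_reaction_amounts()
for name, value in amounts.items():
    print(f"{name}: {value}")
```

Multiplying each value by the number of reactions, plus a small excess for pipetting loss, gives a master mix
recipe.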

The synthesis of primers is performed following the phosphoramidite method, on a
GENSET UFPS 24.1 synthesizer.

To reduce the expense of preparing amplification primers for use in the above procedures, short primers may be
used. While primers and probes having between 15 and 20 (or more) nucleotides are usually highly specific to a given
nucleic acid sequence, it may be inconvenient and expensive to synthesize a relatively long oligonucleotide for each
analysis. In order to at least partially circumvent this problem, it is often possible to use smaller but still relatively
specific oligonucleotides that are shorter in length to create a manageable library. For example, a library of
oligonucleotides comprising about 8 to 10 nucleotides is conceivable and has already been used for sequencing of a
40,000 bp cosmid DNA (Studier, Proc. Natl. Acad. Sci. USA 86(18):6917-6921 (1989), the disclosure of which is
incorporated herein by reference).

Another potential way to obtain specific primers and probes with a small library of oligonucleotides is to
generate longer, more specific primers and probes from combinations of shorter, less specific oligonucleotides. Libraries
of shorter oligonucleotides, each one being from about five to eight nucleotides in length, have already been used
(Kieleczawa et al., Science 258:1787-1791 (1992); Kotler et al., Proc. Natl. Acad. Sci. USA 90:4241-4245 (1993);
Kaczorowski and Szybalski, Anal. Biochem. 221:127-135 (1994), the disclosures of which are incorporated herein by
reference). Suitable probes and primers of appropriate length can therefore be designed through the association of two
or three shorter oligonucleotides to constitute modular primers. The association between primers can be either covalent,
resulting from the activity of DNA T4 ligase, or non-covalent through base-stacking energy.

The amplification is performed on a Perkin Elmer 9600 Thermocycler or MJ Research PTC200 with heating lid.
After heating at 95°C for 10 minutes, 40 cycles are performed. Each cycle comprises: 30 sec at 95°C, 1 minute at
54°C, and 30 sec at 72°C. For final elongation, 10 minutes at 72°C ends the amplification.

The quantities of the amplification products obtained are determined on 96-well microtiter plates, using a
fluorimeter and Picogreen as intercalating agent (Molecular Probes).

The sequences of the amplification products are determined using automated dideoxy terminator sequencing
reactions with a dye-primer cycle sequencing protocol. The products of the sequencing reactions are run on sequencing
gels and the sequences are determined using gel image analysis.

The sequence data are evaluated using software designed to detect the presence of biallelic sites among the
pooled amplified fragments. The polymorphism search is based on the presence of superimposed peaks in the
electrophoresis pattern resulting from different bases occurring at the same position. Because each dideoxy terminator
is labeled with a different fluorescent molecule, the two peaks corresponding to a biallelic site present distinct colors
corresponding to two different nucleotides at the same position on the sequence. The software evaluates the intensity
ratio between the two peaks and the intensity ratio between a given peak and surrounding peaks of the same color.

However, the presence of two peaks can be an artifact due to background noise. To exclude such an artifact,
the two DNA strands are sequenced and a comparison between the peaks is carried out. In order to be registered as a
polymorphic sequence, the polymorphism has to be detected on both strands.

The above procedure permits those amplification products which contain biallelic markers to be identified.
The detection limit for the frequency of biallelic polymorphisms detected by sequencing pools of 100
individuals is about 10% for the minor allele, as verified by sequencing pools of known allelic frequencies. However,
more than 90% of the biallelic polymorphisms detected by the pooling method have a frequency for the minor allele
higher than 25%. Therefore, the biallelic markers selected by this method have a frequency of at least 10% for the minor
allele and 90% or less for the major allele, preferably at least 20% for the minor allele and 80% or less for the major
allele, more preferably at least 30% for the minor allele and 70% or less for the major allele, thus a heterozygosity rate
higher than 0.18, preferably higher than 0.32, more preferably higher than 0.42.
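
The heterozygosity thresholds correspond to the minor allele frequencies through the expected heterozygosity of a
biallelic site. The text does not state the formula; the sketch below assumes the usual Hardy-Weinberg expectation
2p(1-p), which reproduces the quoted figures.

```python
def heterozygosity(minor_allele_freq):
    """Expected heterozygosity of a biallelic marker, assuming Hardy-Weinberg proportions."""
    p = minor_allele_freq
    return 2 * p * (1 - p)


rates = {p: round(heterozygosity(p), 2) for p in (0.10, 0.20, 0.30)}
for p, h in rates.items():
    print(f"minor allele {p:.0%} -> heterozygosity {h}")
```

Minor allele frequencies of 10%, 20% and 30% give 0.18, 0.32 and 0.42 respectively, and the maximum of 0.5 is
reached when both alleles are equally frequent.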

In an initial study to determine the frequency of biallelic markers in the human genome that can be obtained
using the above methods, the following results were obtained. 300 different amplicons derived from 100 individuals, and
covering a total of 150 kb obtained from different genomic regions, were sequenced. A total of 54 biallelic
polymorphisms were identified, indicating that there is one biallelic polymorphism with a heterozygosity rate higher than
0.18 (frequency of the minor allele higher than 10%), preferably higher than 0.38 (frequency of the minor allele higher
than 25%), every 2.5 to 3 kb. Given that the human genome is about 3x10^6 kb long, this indicates that, out of the 10^7
biallelic markers present on the human genome, approximately 10^6 have adequate heterozygosity rates for genetic
mapping purposes.
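
The extrapolation from the pilot study to the whole genome is a simple proportion. The sketch below assumes a
genome of 3x10^6 kb and a uniform marker density; the function names are hypothetical.

```python
def kb_per_marker(sequenced_kb, n_markers):
    """Average sequence length per detected marker."""
    return sequenced_kb / n_markers


def genome_wide_estimate(genome_kb, sequenced_kb, n_markers):
    """Marker count expected genome-wide at the density seen in the pilot study."""
    return genome_kb * n_markers / sequenced_kb


spacing_kb = kb_per_marker(150, 54)
total = genome_wide_estimate(3e6, 150, 54)
print(f"one marker every {spacing_kb:.2f} kb; about {total:.2e} markers genome-wide")
```

The pilot figures give one marker about every 2.8 kb and roughly 1.1x10^6 markers genome-wide, consistent with the
stated range of 2.5 to 3 kb and the estimate of approximately 10^6.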

Using the procedures of Examples 1-6, sets containing increasing numbers of biallelic markers may be
constructed. For example, the procedures of Examples 1-6 are used to identify 1 to about 50 biallelic markers. In some
embodiments, the procedures of Examples 1-6 are used to identify about 50 to about 200 biallelic markers. In other
embodiments, the procedures of Examples 1-6 are used to identify about 200 to about 500 biallelic markers. In some
embodiments, the procedures of Examples 1-6 are used to identify about 1,000 biallelic markers. In other embodiments,
the procedures of Examples 1-6 are used to identify about 3,000 biallelic markers. In further embodiments, the
procedures of Examples 1-6 are used to identify about 5,000 biallelic markers. In another embodiment, the procedures
of Examples 1-6 are used to identify about 10,000 biallelic markers. In still another embodiment, the procedures of
Examples 1-6 are used to identify about 20,000 biallelic markers. In still another embodiment, the procedures of
Examples 1-6 are used to identify about 40,000 biallelic markers. In still another embodiment, the procedures of
Examples 1-6 are used to identify about 60,000 biallelic markers. In still another embodiment, the procedures of
Examples 1-6 are used to identify about 80,000 biallelic markers. In still another embodiment, the procedures of
Examples 1-6 are used to identify more than 100,000 biallelic markers. In a further embodiment, the procedures of
Examples 1-6 are used to identify more than 120,000 biallelic markers.

As discussed above, the ordered nucleic acids, such as the inserts in BAC clones, which contain the biallelic
markers of the present invention may span a portion of the genome. For example, the ordered nucleic acids may span at
least 100kb of contiguous genomic DNA, at least 250kb of contiguous genomic DNA, at least 500kb of contiguous
genomic DNA, at least 2Mb of contiguous genomic DNA, at least 5Mb of contiguous genomic DNA, at least 10Mb of
contiguous genomic DNA, or at least 20Mb of contiguous genomic DNA.

In addition, groups of biallelic markers located in proximity to one another along the genome may be identified
within these portions of the genome for use in haplotyping analyses as described below. The biallelic markers included
in each of these groups may be located within a genomic region spanning less than 1kb, from 1 to 5kb, from 5 to 10kb,
from 10 to 25kb, from 25 to 50kb, from 50 to 150kb, from 150 to 250kb, from 250 to 500kb, from 500kb to 1Mb, or
more than 1Mb. It will be appreciated that the ordered DNA fragments containing these groups of biallelic markers need not
completely cover the genomic regions of these lengths but may instead be incomplete contigs having one or more gaps
therein. As discussed in further detail below, biallelic markers may be used in single marker and haplotype association
analyses regardless of the completeness of the corresponding physical contig harboring them.

Using the procedures above, 653 biallelic markers, each having two alleles, were identified using sequences
obtained from BACs which had been localized on the genome. In some cases, markers were identified using pooled BACs
and thereafter reassigned to individual BACs using STS screening procedures such as those described in Examples 2 and
7. The sequences of 50 of these 653 biallelic markers are provided in the accompanying Sequence Listing as SEQ ID
Nos. 1-50 and 51-100 (with SEQ ID Nos. 1-50 being one allele of these 50 biallelic markers and SEQ ID Nos. 51-100
being the other allele of these 50 biallelic markers). Although the sequences of SEQ ID Nos. 1-50 and 51-100 will be
used as exemplary markers throughout the present application, it will be appreciated that the biallelic markers used in
the maps of the present invention are not limited to these particular markers, nor are they limited to having the exact
flanking sequences surrounding the polymorphic bases which are enumerated in SEQ ID Nos. 1-50 and 51-100. Rather,
it will be appreciated that the flanking sequences surrounding the polymorphic bases of SEQ ID Nos. 1-50 and 51-100
may be lengthened or shortened to any extent compatible with their intended use and the present invention specifically
contemplates such sequences. The sequences of these 653 biallelic markers, including the sequences of SEQ ID Nos.
1-50 and 51-100, may be used to construct the maps of the present invention as well as in the gene identification and
diagnostic techniques described herein. It will be appreciated that the biallelic markers referred to herein may be of any
length compatible with their intended use provided that the markers include the polymorphic base, and the present
invention specifically contemplates such sequences.




Ordering of biallelic markers
Biallelic markers can be ordered to determine their positions along chromosomes, preferably subchromosomal
regions, most preferably along the above described minimally overlapping ordered BAC arrays, as follows.

The positions of the biallelic markers along chromosomes may be determined using a variety of methodologies.
In one approach, radiation hybrid mapping is used. Radiation hybrid (RH) mapping is a somatic cell genetic approach that
can be used for high resolution mapping of the human genome. In this approach, cell lines containing one or more human
chromosomes are lethally irradiated, breaking each chromosome into fragments whose size depends on the radiation dose.
These fragments are rescued by fusion with cultured rodent cells, yielding subclones containing different portions of the
human genome. This technique is described by Benham et al. (Genomics 4:509-517, 1989) and Cox et al. (Science
250:245-250, 1990), the entire contents of which are hereby incorporated by reference. The random and independent
nature of the subclones permits efficient mapping of any human genome marker. Human DNA isolated from a panel of
80-100 cell lines provides a mapping reagent for ordering biallelic markers. In this approach, the frequency of breakage
between markers is used to measure distance, allowing construction of fine resolution maps as has been done for ESTs
(Schuler et al., Science 274:540-545, 1996, hereby incorporated by reference).
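
How a breakage frequency between two markers might be derived from a radiation hybrid panel can be sketched as follows. The two-point estimator used here (discordant retention divided by R_A + R_B - 2 R_A R_B, with distance expressed as -ln(1 - theta) in centiRays) is a commonly cited form brought in as an assumption for illustration; it is not taken from the present text, and the retention vectors are hypothetical.

```python
import math

def breakage_frequency(retention_a, retention_b):
    """Two-point breakage frequency estimate from 0/1 retention scores per hybrid line."""
    n = len(retention_a)
    if n == 0 or n != len(retention_b):
        raise ValueError("retention vectors must be non-empty and of equal length")
    discordant = sum(a != b for a, b in zip(retention_a, retention_b)) / n
    r_a = sum(retention_a) / n  # retention frequency of marker A
    r_b = sum(retention_b) / n  # retention frequency of marker B
    denom = r_a + r_b - 2.0 * r_a * r_b
    if denom == 0:
        raise ValueError("markers are uninformative (never or always retained)")
    return min(discordant / denom, 1.0)

def centirays(theta):
    """Map distance in centiRays for a breakage frequency theta."""
    if theta >= 1.0:
        return math.inf
    return -100.0 * math.log(1.0 - theta)

# Hypothetical scores for two markers across ten hybrid cell lines.
marker_a = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
marker_b = [1, 0, 0, 0, 1, 1, 0, 0, 0, 0]
theta = breakage_frequency(marker_a, marker_b)
print(round(theta, 3), round(centirays(theta), 1))
```

Markers that are always retained together give a breakage frequency of zero, so closely spaced markers map near one another.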

RH mapping has been used to generate a high-resolution whole genome radiation hybrid map of human
chromosome 17q22-q25.3 across the genes for growth hormone (GH) and thymidine kinase (TK) (Foster et al., Genomics
33:185-192, 1996), the region surrounding the Gorlin syndrome gene (Obermayr et al., Eur. J. Hum. Genet. 4:242-245,
1996), 60 loci covering the entire short arm of chromosome 12 (Raeymaekers et al., Genomics 29:170-178, 1995), the
region of human chromosome 22 containing the neurofibromatosis type 2 locus (Frazer et al., Genomics 14:574-584, 1992)
and 13 loci on the long arm of chromosome 5 (Warrington et al., Genomics 11:701-708, 1991).

Alternatively, PCR based techniques and human-rodent somatic cell hybrids may be used to determine the
positions of the biallelic markers on the chromosomes. In such approaches, oligonucleotide primer pairs which are capable of
generating amplification products containing the polymorphic bases of the biallelic markers are designed. Preferably, the
oligonucleotide primers are 18-23 bp in length and are designed for PCR amplification. The creation of PCR primers from
known sequences is well known to those with skill in the art. For a review of PCR technology see Erlich, H.A., PCR
Technology: Principles and Applications for DNA Amplification, 1992, W.H. Freeman and Co., New York.

The primers are used in polymerase chain reactions (PCR) to amplify templates from total human genomic DNA.
PCR conditions are as follows: 60 ng of genomic DNA is used as a template for PCR with 80 ng of each oligonucleotide
primer, 0.8 unit of Taq polymerase, and 1 µCi of a 32P-labeled deoxycytidine triphosphate. The PCR is performed in a
microplate thermocycler (Techne) under the following conditions: 30 cycles of 94°C, 1.4 min; 55°C, 2 min; and 72°C, 2 min;
with a final extension at 72°C for 10 min. The amplified products are analyzed on a 6% polyacrylamide sequencing gel and
visualized by autoradiography. If the length of the resulting PCR product is identical to the length expected for an
amplification product containing the polymorphic base of the biallelic marker, then the PCR reaction is repeated with DNA
templates from two panels of human-rodent somatic cell hybrids, BIOS PCRable DNA (BIOS Corporation) and NIGMS
Human-Rodent Somatic Cell Hybrid Mapping Panel Number 1 (NIGMS, Camden, NJ).

PCR is used to screen a series of somatic cell hybrid cell lines containing defined sets of human chromosomes for
the presence of a given biallelic marker. DNA is isolated from the somatic hybrids and used as starting templates for PCR
reactions using the primer pairs from the biallelic marker. Only those somatic cell hybrids with chromosomes containing the
human sequence corresponding to the biallelic marker will yield an amplified fragment. The biallelic markers are assigned to
a chromosome by analysis of the segregation pattern of PCR products from the somatic hybrid DNA templates. The single
human chromosome present in all cell hybrids that give rise to an amplified fragment is the chromosome containing that
biallelic marker. For a review of techniques and analysis of results from somatic cell gene mapping experiments, see
Ledbetter et al., Genomics 6:475-481 (1990).
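
The segregation analysis described above amounts to a set intersection, which can be sketched as follows. The panel composition and PCR results are hypothetical, and the optional exclusion of chromosomes found in PCR-negative hybrids is an added assumption rather than a step stated in the text.

```python
def assign_chromosome(hybrid_chromosomes, pcr_positive, use_negatives=True):
    """Return the human chromosomes consistent with the PCR segregation pattern.

    hybrid_chromosomes: dict mapping hybrid name -> set of human chromosomes it carries
    pcr_positive: dict mapping hybrid name -> True if an amplified fragment was obtained
    """
    positives = [set(hybrid_chromosomes[h]) for h, hit in pcr_positive.items() if hit]
    if not positives:
        return set()
    # Chromosomes present in every hybrid that yielded an amplified fragment.
    candidates = set.intersection(*positives)
    if use_negatives:
        # Assumption: a chromosome carried by a PCR-negative hybrid cannot hold the marker.
        for h, hit in pcr_positive.items():
            if not hit:
                candidates -= set(hybrid_chromosomes[h])
    return candidates

# Hypothetical three-hybrid panel.
panel = {"H1": {"3", "21"}, "H2": {"21", "X"}, "H3": {"3", "X"}}
results = {"H1": True, "H2": True, "H3": False}
print(assign_chromosome(panel, results))
```

A single remaining chromosome gives the assignment; an empty or multi-element result suggests that more hybrids need to be screened.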

Example 7 describes a preferred method for positioning of biallelic markers on clones, such as BAC clones,
obtained from genomic DNA libraries.

Example 7 

Screening BAC libraries with biallelic markers
Amplification primers enabling the specific amplification of DNA fragments carrying the biallelic markers (including
the 653 biallelic markers obtained above, which include the sequences of SEQ ID Nos. 1-50 and 51-100) may be used to
screen clones in any genomic DNA library, preferably the BAC libraries described above, for the presence of the biallelic
markers.

Pairs of primers were designed which allowed the amplification of fragments carrying the 653 biallelic markers
obtained above. The amplification primers may be used to screen clones in a genomic DNA library for the presence of the
653 biallelic markers. For example, pairs of amplification primers of SEQ ID Nos. 101-150 and 151-200 may be used to
amplify fragments which include the polymorphic bases of the biallelic markers of SEQ ID Nos. 1-50 and 51-100.

It will be appreciated that amplification primers for the biallelic markers may be any sequences which allow the
specific amplification of any DNA fragment carrying the markers and may be designed using techniques familiar to those
skilled in the art. The amplification primers may be oligonucleotides of 8, 10, 15, 20 or more bases in length which
enable the amplification of any fragment carrying the polymorphic site in the markers. The polymorphic base may be in
the center of the amplification product or, alternatively, it may be located off-center. For example, in some
embodiments, the amplification product produced using these primers may be at least 100 bases in length (i.e. 50
nucleotides on each side of the polymorphic base in amplification products in which the polymorphic base is centrally
located). In other embodiments, the amplification product produced using these primers may be at least 500 bases in
length (i.e. 250 nucleotides on each side of the polymorphic base in amplification products in which the polymorphic base
is centrally located). In still further embodiments, the amplification product produced using these primers may be at
least 1000 bases in length (i.e. 500 nucleotides on each side of the polymorphic base in amplification products in which
the polymorphic base is centrally located). Amplification primers such as those described above are included within the
scope of the present invention.

The localization of biallelic markers on BAC clones is performed essentially as described in Example 2.
The BAC clones to be screened are distributed in three dimensional pools as described in Example 2.
Amplification reactions are conducted on the pooled BAC clones using primers specific for the biallelic markers
to identify BAC clones which contain the biallelic markers, using procedures essentially similar to those described in
Example 2.

Amplification products resulting from the amplification reactions are detected by conventional agarose gel
electrophoresis combined with automatic image capturing and processing. PCR screening for a biallelic marker involves
three steps: (1) identifying the positive primary pools; (2) for each positive primary pool, identifying the positive plate,
row and column 'subpools' to obtain the address of the positive clone; (3) directly confirming the PCR assay on the
identified clone. PCR assays are performed with primers defining the biallelic marker.
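
Step (2) of this screening, in which the positive plate, row and column subpools together give a clone address, can be sketched as follows. The pool labels are hypothetical and the function name is illustrative.

```python
from itertools import product

def candidate_addresses(positive_plates, positive_rows, positive_columns):
    """Return every (plate, row, column) address consistent with the positive subpools."""
    return sorted(product(positive_plates, positive_rows, positive_columns))

# Hypothetical subpool results for one positive primary pool.
hits = candidate_addresses({"plate_07"}, {"C"}, {11})
if len(hits) == 1:
    print("clone to confirm by direct PCR:", hits[0])
else:
    print("ambiguous, confirm each candidate:", hits)
```

When more than one subpool is positive in a dimension, several candidate addresses remain, which is one reason the direct confirmation of step (3) is useful.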

Screening is conducted as follows. First, BAC DNA is isolated as follows. Bacteria containing the genomic
inserts are grown overnight at 37°C in 120 µl of LB containing chloramphenicol (12 µg/ml). DNA is extracted by the
following protocol:

Centrifuge 10 min at 4°C and 2000 rpm
Eliminate supernatant and resuspend pellet in 120 µl TE 10-2 (Tris HCl 10 mM, EDTA 2 mM)
Centrifuge 10 min at 4°C and 2000 rpm
Eliminate supernatant and incubate pellet with 20 µl lysozyme 1 mg/ml during 15 min at room temperature
Add 20 µl proteinase K 100 µg/ml and incubate 15 min at 60°C
Add 8 µl DNAse 2U/µl and incubate 1 hr at room temperature
Add 100 µl TE 10-2 and keep at -80°C



PCR assays are performed using the following protocol:

Final volume                                          15 µl
BAC DNA                                               1.7 ng/µl
MgCl2                                                 2 mM
dNTP (each)                                           200 µM
primer (each)                                         2.9 ng/µl
Ampli Taq Gold DNA polymerase                         0.05 unit/µl
PCR buffer (10x = 0.1 M Tris-HCl pH 8.3, 0.5 M KCl)   1x



The amplification is performed on a Genius II thermocycler. After heating at 95°C for 10 min, 40 cycles are
performed. Each cycle comprises: 30 sec at 95°C, 54°C for 1 min, and 30 sec at 72°C. For final elongation, 10 min at
72°C end the amplification. PCR products are analyzed on 1% agarose gel with 0.1 mg/ml ethidium bromide.

Using such procedures, a number of BAC clones carrying selected biallelic markers can be isolated. The
position of these BAC clones on the human genome can be defined by performing STS screening as described in Example
2. Preferably, to decrease the number of STSs to be tested, each BAC can be localized on chromosomal or
subchromosomal regions by procedures such as those described in Examples 8 and 9 below. This localization will allow
the selection of a subset of STSs corresponding to the identified chromosomal or subchromosomal region. Testing each
BAC with such a subset of STSs and taking account of the position and order of the STSs along the genome will allow a
refined positioning of the corresponding biallelic marker along the genome.

In other embodiments, if the DNA library used to isolate BAC inserts or any type of genomic DNA fragments
harboring the selected biallelic markers already constitutes a physical map of the genome or any portion thereof, using the
known order of the DNA fragments will allow the order of the biallelic markers to be established.

As discussed above, it will be appreciated that markers carried by the same fragment of genomic DNA, such as
the insert in a BAC clone, need not necessarily be ordered with respect to one another within the genomic fragment to
conduct single point or haplotype association analyses. However, in other embodiments of the present maps, the order of
biallelic markers carried by the same fragment of genomic DNA may be determined.

The positions of the biallelic markers used to construct the maps of the present invention, including the 653
biallelic markers obtained above, may be assigned to subchromosomal locations using Fluorescence In Situ Hybridization
(FISH) (Cherif et al., Proc. Natl. Acad. Sci. USA, 87:6639-6643 (1990), the disclosure of which is incorporated herein by
reference). FISH analysis is described in Example 8 below.

Example 8

Assignment of Biallelic Markers to Subchromosomal Regions
Metaphase chromosomes are prepared from phytohemagglutinin (PHA)-stimulated blood cell donors.
PHA-stimulated lymphocytes from healthy males are cultured for 72 h in RPMI-1640 medium. For synchronization, methotrexate
(10 µM) is added for 17 h, followed by addition of 5-bromodeoxyuridine (5-BudR, 0.1 mM) for 6 h. Colcemid (1 µg/ml) is
added for the last 15 min before harvesting the cells. Cells are collected, washed in RPMI, incubated with a hypotonic
solution of KCl (75 mM) at 37°C for 15 min and fixed in three changes of methanol:acetic acid (3:1). The cell suspension is
dropped onto a glass slide and air-dried.

BAC clones carrying the biallelic markers used to construct the maps of the present invention (including the 653
biallelic markers obtained above) can be isolated as described above. These BACs or portions thereof, including fragments
carrying said biallelic markers, obtained for example from amplification reactions using pairs of amplification primers as
described above, can be used as probes to be hybridized with metaphasic chromosomes. It will be appreciated that the
hybridization probes to be used in the contemplated method may be generated using alternative methods well known to
those skilled in the art. Hybridization probes may have any length suitable for this intended purpose.

Probes are then labeled with biotin-16 dUTP by nick translation according to the manufacturer's instructions
(Bethesda Research Laboratories, Bethesda, MD), purified using a Sephadex G-50 column (Pharmacia, Uppsala, Sweden) and
precipitated. Just prior to hybridization, the DNA pellet is dissolved in hybridization buffer (50% formamide, 2 X SSC, 10%
dextran sulfate, 1 mg/ml sonicated salmon sperm DNA, pH 7) and the probe is denatured at 70°C for 5-10 min.

Slides kept at -20°C are treated for 1 h at 37°C with RNase A (100 µg/ml), rinsed three times in 2 X SSC and
dehydrated in an ethanol series. Chromosome preparations are denatured in 70% formamide, 2 X SSC for 2 min at 70°C,
then dehydrated at 4°C. The slides are treated with proteinase K (10 µg/100 ml in 20 mM Tris-HCl, 2 mM CaCl2) at 37°C
for 8 min and dehydrated. The hybridization mixture containing the probe is placed on the slide, covered with a coverslip,
sealed with rubber cement and incubated overnight in a humid chamber at 37°C. After hybridization and post-hybridization
washes, the biotinylated probe is detected by avidin-FITC and amplified with additional layers of biotinylated goat anti-avidin
and avidin-FITC. For chromosomal localization, fluorescent R-bands are obtained as previously described (Cherif et al. (1990),
supra). The slides are observed under a LEICA fluorescence microscope (DMRXA). Chromosomes are counterstained with
propidium iodide and the fluorescent signal of the probe appears as two symmetrical yellow-green spots on both chromatids
of the fluorescent R-band chromosome (red). Thus, a particular biallelic marker may be localized to a particular cytogenetic
R-band on a given chromosome.

The above procedure was used to confirm the subchromosomal location of 95% of the BAC clones harboring the
653 markers obtained above. In particular, the 50 markers of SEQ ID Nos. 1-50 and 51-100 were assigned to
subchromosomal regions of chromosome 21. Simple identification numbers were attributed to each BAC from which the
markers are derived. Figure 1 is a cytogenetic map of chromosome 21 indicating the subchromosomal regions therein. Table
1 lists the internal identification number of the localized biallelic markers, the internal identification number of the BACs from
which the markers were derived, the size of the BAC insert, the average intermarker distance in the BAC insert and the
subchromosomal locations of the biallelic markers. The sequences of the localized markers are provided as SEQ ID Nos. 1-50
and 51-100 in the accompanying sequence listing. Amplification primers for generating amplification products containing
the polymorphic bases of these markers are also provided as SEQ ID Nos. 101-150 and 151-200 in the accompanying
sequence listing. Microsequencing primers for use in determining the identities of the polymorphic bases of these biallelic
markers are provided in the accompanying Sequence Listing as SEQ ID Nos. 201-250 and 251-300.

The rate at which biallelic markers may be assigned to subchromosomal regions may be enhanced through
automation. For example, probe preparation may be performed in a microtiter plate format, using adequate robots. The rate
at which biallelic markers may be assigned to subchromosomal regions may be enhanced using techniques which permit the
in situ hybridization of multiple probes on a single microscope slide, such as those disclosed in Larin et al., Nucleic Acids
Research 22: 3689-3692 (1994), the disclosure of which is incorporated herein by reference. In the largest test format
described, different probes were hybridized simultaneously by applying them directly from a 96-well microtiter dish which
was inverted on a glass plate. Software for image data acquisition and analysis that is adapted to each optical system, test
format, and fluorescent probe used, can be derived from the system described in Lichter et al., Science 247: (1990),
the disclosure of which is incorporated herein by reference. Such software measures the relative distance between the
center of the fluorescent spot corresponding to the hybridized probe and the telomeric end of the short arm of the
corresponding chromosome, as compared to the total length of the chromosome. The rate at which biallelic markers are
assigned to subchromosomal locations may be further enhanced by simultaneously applying probes labeled with different
fluorescent tags to each well of the 96 well dish. A further benefit of conducting the analysis on one slide is that it
facilitates automation, since a microscope having a moving stage and the capability of detecting fluorescent signals in
different metaphase chromosomes could provide the coordinates of each probe on the metaphase chromosomes distributed
on the 96 well dish.
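
The relative distance measured by such software can be illustrated with a minimal sketch. It assumes image coordinates for the short-arm telomere, the probe signal and the long-arm telomere, and it treats the chromosome as a straight segment, which is a simplification; the function name is hypothetical.

```python
import math

def fractional_position(short_arm_telomere, signal_center, long_arm_telomere):
    """Distance of the probe signal from the short-arm telomere, as a fraction of chromosome length."""
    signal_distance = math.dist(short_arm_telomere, signal_center)
    chromosome_length = math.dist(short_arm_telomere, long_arm_telomere)
    if chromosome_length == 0:
        raise ValueError("telomere coordinates must differ")
    return signal_distance / chromosome_length

# Hypothetical pixel coordinates measured on one metaphase chromosome.
print(fractional_position((0, 0), (0, 3), (0, 12)))
```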

Example 9 below describes an alternative method to position biallelic markers which allows their assignment to
human chromosomes.



Example 9

Assignment of Biallelic Markers to Human Chromosomes



The biallelic markers used to construct the maps of the present invention, including the 653 biallelic markers
obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100), may be assigned to a human
chromosome using monosomal analysis as described below.

The chromosomal localization of a biallelic marker can be performed through the use of somatic cell hybrid
panels. For example, 24 panels, each panel containing a different human chromosome, may be used (Russell et al.,
Somat. Cell Mol. Genet. 22:425-431 (1996); Drwinga et al., Genomics 16:311-314 (1993), the disclosures of which are
incorporated herein by reference).

The biallelic markers are localized as follows. The DNA of each somatic cell hybrid is extracted and purified.
Genomic DNA samples from a somatic cell hybrid panel are prepared as follows. Cells are lysed overnight at 42°C with
3.7 ml of lysis solution composed of:

3 ml TE 10-2 (Tris HCl 10 mM, EDTA 2 mM) / NaCl 0.4 M

200 µl SDS 10%

500 µl K-proteinase (2 mg K-proteinase in TE 10-2 / NaCl 0.4 M)

For the extraction of proteins, 1 ml saturated NaCl (6M) (1/3.5 v/v) is added. After vigorous agitation, the
solution is centrifuged for 20 min at 10,000 rpm. For the precipitation of DNA, 2 to 3 volumes of 100% ethanol are
added to the previous supernatant and the solution is centrifuged for 30 min at 2,000 rpm. The DNA solution is rinsed
three times with 70% ethanol to eliminate salts, and centrifuged for 20 min at 2,000 rpm. The pellet is dried at 37°C,
and resuspended in 1 ml TE 10-1 or 1 ml water. The DNA concentration is evaluated by measuring the OD at 260 nm (1
unit OD = 50 µg/ml DNA). To determine the presence of proteins in the DNA solution, the OD260/OD280 ratio is
determined. Only DNA preparations having an OD260/OD280 ratio between 1.8 and 2 are used in the PCR assay.
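
The concentration and purity criteria above can be expressed as a short sketch. It assumes that the ratio intended is the usual OD260/OD280 check for protein contamination; the dilution factor parameter and the function names are hypothetical additions.

```python
def dna_concentration_ug_per_ml(od260, dilution_factor=1.0):
    """DNA concentration from absorbance at 260 nm, using 1 OD unit = 50 µg/ml."""
    return od260 * 50.0 * dilution_factor

def passes_purity_check(od260, od280, low=1.8, high=2.0):
    """True when the OD260/OD280 ratio falls within the accepted range."""
    if od280 <= 0:
        raise ValueError("OD280 must be positive")
    return low <= od260 / od280 <= high

# Hypothetical readings for one preparation.
print(dna_concentration_ug_per_ml(0.5))   # 25.0 µg/ml
print(passes_purity_check(0.95, 0.5))     # ratio 1.9, accepted
```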

Then, a PCR assay is performed on genomic DNA with primers defining the biallelic marker. The PCR assay is
performed as described above for BAC screening. The PCR products are analyzed on a 1% agarose gel containing 0.2
mg/ml ethidium bromide.

The ordering analyses described above may be conducted to generate an integrated genome wide genetic map
comprising about 20,000 biallelic markers (1 biallelic marker per BAC if 20,000 BAC inserts are screened). In some
embodiments, the map includes one or more of the 653 markers obtained above (which include the sequences of SEQ ID
Nos. 1-50 and 51-100 or the sequences complementary thereto).

In another embodiment, the above procedures are conducted to generate a map comprising about 40,000
markers (an average of 2 biallelic markers per BAC if 20,000 BAC inserts are screened). In some embodiments, the map
includes one or more of the 653 markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100
or the sequences complementary thereto).

In a further preferred embodiment, the above procedures are conducted to generate a map
comprising about 60,000 markers (an average of 3 biallelic markers per BAC if 20,000 BAC inserts are screened). In
some embodiments, the map includes one or more of the 653 markers obtained above (which include the sequences of
SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto).

In a further preferred embodiment, the above procedures are conducted to generate a map
comprising about 80,000 markers (an average of 4 biallelic markers per BAC if 20,000 BAC inserts are screened). In
some embodiments, the map includes one or more of the 653 markers obtained above (which include the sequences of
SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto).

In yet another embodiment, the above procedures are conducted to generate a map comprising about 100,000
markers (an average of 5 biallelic markers per BAC if 20,000 BAC inserts are screened). In some embodiments, the map
includes one or more of the 653 markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100
or the sequences complementary thereto).

In a further embodiment, the above procedures are conducted to generate a map comprising about 120,000
markers (an average of 6 biallelic markers per BAC if 20,000 BAC inserts are screened). In some embodiments, the map
includes one or more of the 653 markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100
or the sequences complementary thereto).

Alternatively, maps having the above-specified average numbers of biallelic markers per BAC which comprise
smaller portions of the genome, such as a set of chromosomes, a single chromosome, a particular subchromosomal
region, or any other desired portion of the genome, may also be constructed using the procedures provided herein.

In some embodiments, the biallelic markers in the map are separated from one another by an average distance
of 10-200kb. In further embodiments, the biallelic markers in the map are separated from one another by an average
distance of 15-150kb. In yet another embodiment, the biallelic markers in the map are separated from one another by an
average distance of 20-100kb. In other embodiments, the biallelic markers in the map are separated from one another
by an average distance of 100-150kb. In further embodiments, the biallelic markers in the map are separated from one
another by an average distance of 50-100kb. In yet another embodiment, the biallelic markers in the map are separated
from one another by an average distance of 25-50kb. Maps having the above-specified intermarker distances which
comprise smaller portions of the genome, such as a set of chromosomes, a single chromosome, a particular
subchromosomal region, or any other desired portion of the genome, may also be constructed using the procedures
provided herein.
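
The relation between the number of markers per BAC and the average inter-marker distance can be sketched as follows, assuming a genome of about 3 x 10^6 kb covered by 20,000 BAC inserts. The constants and the function name are illustrative.

```python
GENOME_KB = 3_000_000   # assumed genome length in kb
N_BACS = 20_000         # number of BAC inserts screened

def average_spacing_kb(markers_per_bac, n_bacs=N_BACS, genome_kb=GENOME_KB):
    """Average inter-marker distance for a map with the given number of markers per BAC."""
    return genome_kb / (markers_per_bac * n_bacs)

for k in (1, 2, 3, 4, 5, 6):
    print(f"{k} per BAC -> {k * N_BACS} markers, about {average_spacing_kb(k):g} kb apart")
```

Under these assumptions, one marker per BAC corresponds to an average spacing of 150 kb and six markers per BAC to 25 kb.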

Figure 2, showing the results of computer simulations of the distribution of inter-marker spacing on a randomly
distributed set of biallelic markers, indicates the percentage of biallelic markers which will be spaced a given distance
apart for a given number of markers/BAC in the genomic map (assuming 20,000 BACs constituting a minimally
overlapping array covering the entire genome are evaluated). One hundred iterations were performed for each
simulation (20,000 marker map, 40,000 marker map, 60,000 marker map, 120,000 marker map).

As illustrated in Figure 2a, 98% of inter-marker distances will be lower than 150kb provided 60,000 evenly
distributed markers are generated (3 per BAC); 90% of inter-marker distances will be lower than 150kb provided 40,000
evenly distributed markers are generated (2 per BAC); and 50% of inter-marker distances will be lower than 150kb
provided 20,000 evenly distributed markers are generated (1 per BAC).

As illustrated in Figure 2b, 98% of inter-marker distances will be lower than 80kb provided 120,000 evenly
distributed markers are generated (6 per BAC); 80% of inter-marker distances will be lower than 80kb provided 60,000
evenly distributed markers are generated (3 per BAC); and 15% of inter-marker distances will be lower than 80kb
provided 20,000 evenly distributed markers are generated (1 per BAC).
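
A simulation of this kind can be sketched as follows. The model is an assumption and not necessarily the one used to produce Figure 2: BAC inserts are taken as contiguous, non-overlapping 150 kb segments (3 x 10^6 kb divided by 20,000 BACs), a fixed number of markers is placed uniformly at random within each insert, and the panel is scaled down to 2,000 BACs to keep the run short.

```python
import random

def fraction_below(markers_per_bac, threshold_kb, n_bacs=2000, bac_kb=150.0, seed=0):
    """Fraction of consecutive inter-marker distances shorter than threshold_kb."""
    rng = random.Random(seed)
    positions = []
    for bac_index in range(n_bacs):
        start = bac_index * bac_kb  # contiguous, non-overlapping inserts (assumption)
        positions.extend(start + rng.uniform(0.0, bac_kb) for _ in range(markers_per_bac))
    positions.sort()
    gaps = [right - left for left, right in zip(positions, positions[1:])]
    return sum(gap < threshold_kb for gap in gaps) / len(gaps)

for k in (1, 2, 3, 6):
    print(k, round(fraction_below(k, 150.0), 2), round(fraction_below(k, 80.0), 2))
```

With one marker per BAC, the gap between neighbouring markers in this model is symmetric about 150 kb, so about one half of the distances fall below 150 kb; the fraction rises as more markers are placed per BAC.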

As already mentioned, high density biallelic marker maps allow association studies to be performed to identify
genes involved in complex traits.

Association studies examine the frequency of marker alleles in unrelated trait positive (T+) individuals
compared with trait negative (T-) controls, and are generally employed in the detection of polygenic inheritance.

Association studies as a method of mapping genetic traits rely on the phenomenon of linkage disequilibrium, 
which is described below. 



Linkage Disequilibrium

If two genetic loci lie on the same chromosome, then sets of alleles on the same chromosomal segment (called
haplotypes) tend to be transmitted as a block from generation to generation. When not broken up by recombination,
haplotypes can be tracked not only through pedigrees but also through populations. The resulting phenomenon at the
population level is that the occurrence of pairs of specific alleles at different loci on the same chromosome is not
random, and the deviation from random is called linkage disequilibrium (LD).

If a specific allele in a given gene is directly involved in causing a particular trait T, its frequency will be
statistically increased in a T+ population when compared to the frequency in a T- population. As a consequence of the
existence of LD, the frequency of all other alleles present in the haplotype carrying the trait-causing allele (TCA) will also
be increased in T+ individuals compared to T- individuals. Therefore, association between the trait and any allele in
linkage disequilibrium with the trait-causing allele will suffice to suggest the presence of a trait-related gene in that
particular allele's region. Linkage disequilibrium allows the relative frequencies in T+ and T- populations of a limited
number of genetic polymorphisms (specifically biallelic markers) to be analyzed as an alternative to screening all possible
functional polymorphisms in order to find trait-causing alleles.
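
A comparison of allele frequencies between T+ and T- populations can be illustrated with a minimal sketch. The Pearson chi-square test on a 2 x 2 table of allele counts is one common choice and is used here as an assumption, since no particular statistic is specified at this point in the text; the counts are hypothetical.

```python
import math

def allele_association(case_a, case_b, control_a, control_b):
    """Pearson chi-square (1 degree of freedom) and p-value for a 2 x 2 table of allele counts."""
    n = case_a + case_b + control_a + control_b
    denom = ((case_a + case_b) * (control_a + control_b)
             * (case_a + control_a) * (case_b + control_b))
    if denom == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (case_a * control_b - case_b * control_a) ** 2 / denom
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical allele counts: 100 T+ and 100 T- individuals, two alleles each.
chi2, p_value = allele_association(case_a=120, case_b=80, control_a=90, control_b=110)
print(round(chi2, 2), p_value < 0.01)
```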

The present invention then also concerns biallelic markers in linkage disequilibrium with the specific biallelic
markers described above and which are expected to present similar characteristics in terms of their respective
association with a given trait. In a preferred embodiment, the present invention concerns the biallelic markers that are in
linkage disequilibrium with the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50
and 51-100 or the sequences complementary thereto).

LD among a set of biallelic markers having an adequate heterozygosity rate can be determined by genotyping
between 50 and 1000 unrelated individuals, preferably between 75 and 200, more preferably around 100. Genotyping a
biallelic marker consists of determining the specific allele carried by an individual at the given polymorphic base of the
biallelic marker. Genotyping can be performed using similar methods as those described above for the generation of the
biallelic markers, or using other genotyping methods such as those further described below.

LD between any pair of biallelic markers comprising at least one of the biallelic markers of the present
invention (Mi,Mj) can be calculated for every allele combination (Mi1,Mj1 ; Mi1,Mj2 ; Mi2,Mj1 ; Mi2,Mj2), according to the
Piazza formula:

\[ \Delta_{M_{ik},M_{jl}} = \sqrt{\theta_4} - \sqrt{(\theta_4 + \theta_3)(\theta_4 + \theta_2)} \]

where:

θ4 = (- -) = frequency of genotypes not having allele k at Mi and not having allele l at Mj
θ3 = (- +) = frequency of genotypes not having allele k at Mi and having allele l at Mj
θ2 = (+ -) = frequency of genotypes having allele k at Mi and not having allele l at Mj
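
The Piazza calculation can be sketched in a few lines, assuming the formula is read as the square root of θ4 minus the
square root of (θ4 + θ3)(θ4 + θ2). The function name and the example frequencies are hypothetical and serve only to
show how the three genotype frequencies enter the expression.

```python
import math


def piazza_delta(theta4, theta3, theta2):
    """Delta = sqrt(theta4) - sqrt((theta4 + theta3) * (theta4 + theta2)).

    theta4: frequency of genotypes lacking allele k at Mi and lacking allele l at Mj
    theta3: frequency of genotypes lacking allele k at Mi and having allele l at Mj
    theta2: frequency of genotypes having allele k at Mi and lacking allele l at Mj
    """
    for name, value in (("theta4", theta4), ("theta3", theta3), ("theta2", theta2)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a frequency between 0 and 1")
    if theta4 + theta3 + theta2 > 1.0 + 1e-9:
        raise ValueError("the three frequencies cannot sum to more than 1")
    return math.sqrt(theta4) - math.sqrt((theta4 + theta3) * (theta4 + theta2))


# Hypothetical genotype frequencies for one allele combination.
print(piazza_delta(0.64, 0.16, 0.16))
print(piazza_delta(0.50, 0.00, 0.00))
```

The function would be called once per allele combination (k, l) of the marker pair.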

Linkage disequilibrium (LD) between pairs of biallelic markers (Mi, Mj) can also be calculated for every allele
combination (Mi1,Mj1 ; Mi1,Mj2 ; Mi2,Mj1 ; Mi2,Mj2) according to the maximum likelihood estimate (MLE) for delta (the
composite linkage disequilibrium coefficient), as described by Weir (B.S. Weir, Genetic Data Analysis, (1996), Sinauer
Ass. Eds, the disclosure of which is incorporated herein by reference). This formula allows linkage disequilibrium
between alleles to be estimated when only genotype, and not haplotype, data are available. This LD composite test
makes no assumption for random mating in the sampled population, and thus seems to be more appropriate than other
LD tests for genotypic data.
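
A minimal sketch of a composite LD estimate from unphased genotypes is given below. It assumes the commonly quoted
form delta = n_AB / n - 2 * p_A * p_B, with n_AB = 2 * n_AABB + n_AABb + n_AaBB + n_AaBb / 2; the exact expression
should be checked against Weir's text. The genotype coding (0, 1 or 2 copies of the tracked allele at each marker) and
the function name are illustrative assumptions.

```python
def composite_ld(genotypes):
    """Composite LD coefficient from unphased two-marker genotypes.

    genotypes: list of (a, b) pairs, where a and b are the number of copies
    (0, 1 or 2) of the tracked allele at marker Mi and marker Mj.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Allele frequencies from genotype counts (two alleles per individual).
    p_a = sum(a for a, _ in genotypes) / (2 * n)
    p_b = sum(b for _, b in genotypes) / (2 * n)
    # Weights: double homozygote 2, single heterozygote 1, double heterozygote 1/2.
    weights = {(2, 2): 2.0, (2, 1): 1.0, (1, 2): 1.0, (1, 1): 0.5}
    n_ab = sum(weights.get(g, 0.0) for g in genotypes)
    return n_ab / n - 2 * p_a * p_b


# Hypothetical sample in which the two markers are perfectly correlated.
sample = [(2, 2)] * 25 + [(1, 1)] * 50 + [(0, 0)] * 25
print(composite_ld(sample))
```

Because only genotype counts are used, no haplotype phase has to be inferred before the coefficient is computed.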

The skilled person will readily appreciate that other LD calculation methods can be used without undue
experimentation.

Example 10 illustrates the measurement of LD between a publicly known biallelic marker, the "ApoE Site A",
located within the Alzheimer's related ApoE gene, and other biallelic markers randomly derived from the genomic region
containing the ApoE gene.



As originally reported by Strittmatter et al. and by Saunders et al. in 1993, the Apo E ε4 allele is strongly
associated with both late-onset familial and sporadic Alzheimer's disease (AD) (Saunders, A.M. Lancet 342: 710-711
(1993) and Strittmatter, W.J. et al., Proc. Natl. Acad. Sci. U.S.A. 90: 1977-1981 (1993), the disclosures of which are
incorporated herein by reference). The 3 major isoforms of human Apolipoprotein E (apoE2, -E3, and -E4), as identified by
isoelectric focusing, are coded for by 3 alleles (ε 2, 3, and 4). The ε 2, ε 3, and ε 4 isoforms differ in amino acid
sequence at 2 sites, residue 112 (called site A) and residue 158 (called site B). The ancestral isoform of the protein is
Apo E3, which at sites A/B contains cysteine/arginine, while ApoE2 and -E4 contain cysteine/cysteine and
arginine/arginine, respectively (Weisgraber, K.H. et al., J. Biol. Chem. 256: 9077-9083 (1981); Rall, S.C. et al., Proc.
Natl. Acad. Sci. U.S.A. 79: 4696-4700 (1982), the disclosures of which are incorporated herein by reference).

Apo E ε 4 is currently considered as a major susceptibility risk factor for AD development in individuals of
different ethnic groups (especially in Caucasians and Japanese compared to Hispanics or African Americans), across all
ages between 40 and 90 years, and in both men and women, as reported recently in a study performed on 5930 AD
patients and 8607 controls (Farrer et al., JAMA 278:1349-1356 (1997), the disclosure of which is incorporated herein
by reference). More specifically, the frequency of a C base coding for arginine 112 at site A is significantly increased in
AD patients.

Although the mechanistic link between Apo E ε 4 and neuronal degeneration characteristic of AD remains to be
established, current hypotheses suggest that the Apo E genotype may influence neuronal vulnerability by increasing the
deposition and/or aggregation of the amyloid beta peptide in the brain or by indirectly reducing energy availability to
neurons by promoting atherosclerosis.

Using the methods of the present invention, biallelic markers that are in the vicinity of the Apo E site A were
generated and the association of one of their alleles with Alzheimer's disease was analyzed. An Apo E public marker
(stSG94) was used to screen a human genome BAC library as previously described. A BAC, which gave a unique FISH
hybridization signal on chromosomal region 19q13.2.3, the chromosomal region harboring the Apo E gene, was selected
for finding biallelic markers in linkage disequilibrium with the Apo E gene as follows.

This BAC contained an insert of 205 kb that was subcloned as previously described. Fifty BAC subclones were
randomly selected and sequenced. Twenty five subclone sequences were selected and used to design twenty five pairs
of PCR primers allowing 500 bp-amplicons to be generated. These PCR primers were then used to amplify the
corresponding genomic sequences in a pool of DNA from 100 unrelated individuals (blood donors of French origin) as
already described.

Amplification products from pooled DNA were sequenced and analyzed for the presence of biallelic
polymorphisms, as already described. Five amplicons were shown to contain a polymorphic base in the pool of 100
unrelated individuals, and therefore these polymorphisms were selected as random biallelic markers in the vicinity of the
Apo E gene. The sequences of both alleles of these biallelic markers (99-344/439 ; 99-355/219 ; 99-359/308 ;
99-365/344 ; 99-366/274) correspond to SEQ ID Nos: 301-305 and 307-311 (See the accompanying Sequence Listing and
Table 10). Corresponding pairs of amplification primers for generating amplicons containing these biallelic markers can
be chosen from those listed as SEQ ID Nos: 313-317 and 319-323.

An additional pair of primers (SEQ ID Nos: 318 and 324) was designed that allows amplification of the
genomic fragment carrying the biallelic polymorphism corresponding to the ApoE marker (99-2452/54; C/T; the C allele
is designated SEQ ID NO: 306 in the accompanying Sequence Listing, while the T allele is designated SEQ ID NO: 312 in
the accompanying Sequence Listing; see also Table 10), publicly known as Apo E site A (Weisgraber et al. (1981),
supra; Rall et al. (1982), supra).

The five random biallelic markers plus the Apo E site A marker were physically ordered by PCR screening of the
corresponding amplicons using all available BACs originally selected from the genomic DNA libraries, as previously
described, using the public Apo E marker stSG94. The amplicons' order derived from this BAC screening is as follows:
(99-344/99-366) - (99-365/99-2452) - 99-359 - 99-355,
where brackets indicate that the exact order of the respective amplicons could not be established.

Linkage disequilibrium among the six biallelic markers (five random markers plus the Apo E site A) was
determined by genotyping the same 100 unrelated individuals from whom the random biallelic markers were identified.

DNA samples and amplification products from genomic PCR were obtained in similar conditions as those
described above for the generation of biallelic markers, and subjected to automated microsequencing reactions using
fluorescent ddNTPs (specific fluorescence for each ddNTP) and the appropriate microsequencing primers having a 3' end
immediately upstream of the polymorphic base in the biallelic markers. The sequence of these microsequencing primers is
indicated within the corresponding sequence listings of SEQ ID Nos: 325-330. Once specifically extended at the 3' end
by a DNA polymerase using the complementary fluorescent dideoxynucleotide analog (thermal cycling), the
microsequencing primer was precipitated to remove the unincorporated fluorescent ddNTPs. The reaction products were
analyzed by electrophoresis on ABI 377 sequencing machines. Results were automatically analyzed by an appropriate
software further described in Example 13.

Linkage disequilibrium (LD) between all pairs of biallelic markers (Mi, Mj) was calculated for every allele
combination (Mi1,Mj1 ; Mi1,Mj2 ; Mi2,Mj1 ; Mi2,Mj2) according to the maximum likelihood estimate (MLE) for delta (the
composite linkage disequilibrium coefficient). The results of the LD analysis between the Apo E Site A marker and the
five new biallelic markers (99-344/439 ; 99-355/219 ; 99-359/308 ; 99-365/344 ; 99-366/274) are summarized in Table
2 below:

Table 2

Markers          d x 100    SEQ ID Nos of the      SEQ ID Nos of the
                            biallelic markers      amplification primers

ApoE Site A                 306                    318
99-2452/54                  312                    324
99-344/439       1          301                    313
                            307                    319
99-366/274       1          305                    317
                            311                    323
99-365/344       8          304                    316
                            310                    322
99-359/308       2          303                    315
                            309                    321
99-355/219       1          302                    314
                            308                    320



The above LD results indicate that among the five biallelic markers randomly selected in a region of about 200
kb containing the Apo E gene, marker 99-365/344T is in relatively strong linkage disequilibrium with the Apo E site A
allele (99-2452/54C).

Therefore, since the Apo E site A allele is associated with Alzheimer's disease, one can predict that the T allele
of marker 99-365/344 will probably be found associated with AD. In order to test this hypothesis, the biallelic markers
of SEQ ID Nos: 301-306 and 307-312 were used in association studies as described below.

225 Alzheimer's disease patients were recruited according to clinical inclusion criteria based on the MMSE
test. The 248 control cases included in this study were both ethnically- and age-matched to the affected cases. Both
affected and control individuals corresponded to unrelated cases. The identities of the polymorphic bases of each of the
biallelic markers were determined in each of these individuals using the methods described above. Techniques for
conducting association studies are further described below.

The results of this study are summarized in Table 3 below:



Table 3

ASSOCIATION DATA

MARKER                      Difference in allele frequency          Corresponding p-value
                            between individuals with Alzheimer's
                            and control individuals

99-344/439                  3.3%                                    9.54 E-02
99-366/274                  1.6%                                    2.09 E-01
99-365/344                  17.7%                                   6.9 E-10
99-2452/54 (ApoE Site A)    23.8%                                   3.95 E-21
99-359/308                  0.4%                                    9.2 E-01
99-355/219                  2.5%                                    2.54 E-01



The frequency of the Apo E site A allele in both AD cases and controls was found in agreement with that
previously reported (ca. 10% in controls and ca. 34% in AD cases, leading to a 24% difference in allele frequency), thus
validating the Apo E ε4 association in the populations used for this study.

Moreover, as predicted from the LD analysis (Table 2), a significant association of the T allele of marker 99-365/344
with AD cases (18% increase in the T allele frequency in AD cases compared to controls, p value for this
difference = 6.9 E-10) was observed.
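
The kind of comparison behind such p-values can be sketched as a one degree of freedom chi-square test on a 2x2 table
of allele counts. This is an assumption about the form of the test, not a reproduction of the analysis reported above.
The counts below are hypothetical roundings of the approximate Apo E site A frequencies quoted in the text (ca. 34% of
450 case chromosomes and ca. 10% of 496 control chromosomes), so the printed p-value is not expected to match Table 3
exactly.

```python
import math


def allele_chi2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square (1 df) on a 2x2 table of allele counts.

    Returns (chi2, p_value).
    """
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the table is empty")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For one degree of freedom the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts: 225 cases and 248 controls, two alleles each.
chi2, p = allele_chi2(153, 297, 50, 446)
print(f"chi2 = {chi2:.1f}, p = {p:.2e}")
```

The same function can be applied to each marker in turn to obtain one p-value per marker.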

The above results indicate that any marker in LD with one given marker associated with a trait will be
associated with the trait. It will be appreciated that, though in this case the ApoE Site A marker is the trait-causing
allele (TCA) itself, the same conclusion could be drawn with any other non-TCA marker associated with the studied trait.

These results further indicate that conducting association studies with a set of biallelic markers randomly
generated within a candidate region at a sufficient density (here about one biallelic marker every 40kb on average)
allows the identification of at least one marker associated with the trait.

In addition, these results correlate with the physical order of the six biallelic markers contemplated within the
present example (see above): marker 99-365/344, which had been found to be the closest in terms of physical distance
to the ApoE Site A marker, also shows the strongest LD with the Apo E site A marker.

In order to further refine the relationship between physical distance and linkage disequilibrium between biallelic
markers, a ca. 450 kb fragment from a genomic region on chromosome 8 was fully sequenced.

LD within ca. 230 pairs of biallelic markers derived therefrom was measured in a random French population
and analyzed as a function of the known physical inter-marker spacing. This analysis confirmed that, on average, LD
between 2 biallelic markers correlates with the physical distance that separates them. It further indicated that LD
between 2 biallelic markers tends to decrease when their spacing increases. More particularly, LD between 2 biallelic
markers tends to decrease when their inter-marker distance is greater than 50kb, and is further decreased when the
inter-marker distance is greater than 75kb. It was further observed that when 2 biallelic markers were further than
150kb apart, most often no significant LD between them could be evidenced. It will be appreciated that the size and
history of the sample population used to measure LD between markers may influence the distance beyond which LD
tends not to be detectable.
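
An analysis of LD as a function of inter-marker spacing can be sketched by grouping marker pairs into distance classes
and averaging LD within each class. The class boundaries below reuse the 50kb, 75kb and 150kb thresholds mentioned
above; the function name, the input layout and the example values are hypothetical and do not reproduce the
chromosome 8 data.

```python
from collections import defaultdict

BIN_EDGES_KB = (50, 75, 150)  # thresholds quoted in the text


def mean_ld_by_distance(pairs, edges=BIN_EDGES_KB):
    """Average LD per distance class.

    pairs: iterable of (distance_kb, ld_value) for each pair of markers.
    Returns a dict mapping a class label to the mean LD of the pairs in it.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for distance_kb, ld in pairs:
        label = f">{edges[-1]}kb"
        for edge in edges:
            if distance_kb <= edge:
                label = f"<={edge}kb"
                break
        sums[label][0] += ld
        sums[label][1] += 1
    return {label: total / count for label, (total, count) in sums.items()}


# Hypothetical marker pairs: (distance in kb, LD value).
example_pairs = [(10, 0.2), (40, 0.4), (60, 0.1), (200, 0.0)]
print(mean_ld_by_distance(example_pairs))
```

Classes that receive no marker pair are simply absent from the result.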

Assuming that LD can be measured between markers spanning regions up to an average of 150kb long, biallelic
marker maps will allow genome-wide LD mapping, provided they have an average inter-marker distance lower than
150kb.

Genome-wide LD mapping aims at identifying, for any TCA being searched, at least one biallelic marker in LD
with said TCA. Preferably, in order to enhance the power of LD maps, in some embodiments, the biallelic markers therein
have average inter-marker distances of 150kb or less, 75kb or less, 50kb or less, 30kb or less, or 25kb or less to
accommodate the fact that, in some regions of the genome, the detection of LD requires lower inter-marker distances.

The present invention provides methods to generate biallelic marker maps with average inter-marker distances
of 150kb or less. In some embodiments, the mean distance between biallelic markers constituting the high density map
will be less than 75kb, preferably less than 50kb. Further preferred maps according to the present invention contain
markers that are less than 37.5kb apart. In highly preferred embodiments, the average inter-marker spacing for the
biallelic markers constituting very high density maps is less than 30kb, most preferably less than 25kb.

Genetic maps containing biallelic markers (including the 653 biallelic markers obtained above, which include the
sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto) may be used to identify and
isolate genes associated with detectable traits. The use of the genetic maps of the present invention is described in
more detail below.

Use of the High Density Biallelic Marker Map to Identify
Genes Associated with a Detectable Trait

One embodiment of the present invention comprises methods for identifying and isolating genes associated
with a detectable trait using the biallelic marker maps of the present invention.

In the past, the identification of genes linked with detectable traits has relied on a statistical approach called
linkage analysis. Linkage analysis is based upon establishing a correlation between the transmission of genetic markers
and that of a specific trait throughout generations within a family. In this approach, all members of a series of affected
families are genotyped with a few hundred markers, typically microsatellite markers, which are distributed at an average
density of one every 10 Mb. By comparing genotypes in all family members, one can attribute sets of alleles to parental
haploid genomes (haplotyping or phase determination). The origin of recombined fragments is then determined in the
offspring of all families. Those that co-segregate with the trait are tracked. After pooling data from all families,
statistical methods are used to determine the likelihood that the marker and the trait are segregating independently in all
families. As a result of the statistical analysis, one or several regions having a high probability of harboring a gene linked
to the trait are selected as candidates for further analysis. The result of linkage analysis is considered as significant (i.e.
there is a high probability that the region contains a gene involved in a detectable trait) when the chance of independent
segregation of the marker and the trait is lower than 1 in 1000 (expressed as a LOD score > 3). Generally, the length
of the candidate region identified using linkage analysis is between 2 and 20Mb.
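
The LOD score threshold can be made concrete with a small sketch. It assumes the simplest textbook case of fully
informative, phase-known meioses, in which the LOD score is the base 10 logarithm of the likelihood at a
recombination fraction theta divided by the likelihood under independent segregation (theta = 0.5). A LOD score of 3
then corresponds to odds of 1000 to 1. The function name and the counts are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """LOD = log10(L(theta) / L(0.5)) for phase-known, fully informative meioses."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    k, n = recombinants, meioses
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null


# Hypothetical pooled family data: 1 recombinant among 20 informative meioses.
score = lod_score(1, 20, 0.05)
print(f"LOD = {score:.2f}, significant: {score > 3}")
```

In practice the score is evaluated over a range of theta values and the maximum is reported.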

Once a candidate region is identified as described above, analysis of recombinant individuals using additional
markers allows further delineation of the candidate linked region.

Linkage analysis studies have generally relied on the use of a maximum of 5,000 microsatellite markers, thus
limiting the maximum theoretical attainable resolution of linkage analysis to ca. 600 kb on average.

Linkage analysis has been successfully applied to map simple genetic traits that show clear Mendelian
inheritance patterns and which have a high penetrance (penetrance is the ratio between the number of trait positive
carriers of allele a and the total number of a carriers in the population). About 100 pathological trait-causing genes were
discovered using linkage analysis over the last 10 years. In most of these cases, the majority of affected individuals had
affected relatives and the detectable trait was rare in the general population (frequencies less than 0.1%). In about 10
cases, such as Alzheimer's Disease, breast cancer, and Type II diabetes, the detectable trait was more common but the
allele associated with the detectable trait was rare in the affected population. Thus, the alleles associated with these
traits were not responsible for the trait in all sporadic cases.

Linkage analysis suffers from a variety of drawbacks. First, linkage analysis is limited by its reliance on the
choice of a genetic model suitable for each studied trait. Furthermore, as already mentioned, the resolution attainable
using linkage analysis is limited, and complementary studies are required to refine the analysis of the typical 2Mb to
20Mb regions initially identified through linkage analysis.

In addition, linkage analysis approaches have proven difficult when applied to complex genetic traits, such as
those due to the combined action of multiple genes and/or environmental factors. In such cases, too large an effort and
cost are needed to recruit the adequate number of affected families required for applying linkage analysis to these
situations, as recently discussed by Risch, N. and Merikangas, K. (Science 273:1516-1517 (1996), the disclosure of
which is incorporated herein by reference).

Finally, linkage analysis cannot be applied to the study of traits for which no large informative families are
available. Typically, this will be the case in any attempt to identify trait-causing alleles involved in sporadic cases, such
as alleles associated with positive or negative responses to drug treatment.



WO 99/04038                                                                    PCT/IB98/01193



The present genetic maps and biallelic markers (including the 653 biallelic markers obtained above, which
include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto) may be used to
identify and isolate genes associated with detectable traits using association studies, an approach which does not
require the use of affected families and which permits the identification of genes associated with sporadic traits.

Association studies are described in more detail below.

Association Studies

As already mentioned, any gene responsible or partly responsible for a given trait will be in LD with some
flanking markers. To map such a gene, specific alleles of these flanking markers which are associated with the gene or
genes responsible for the trait are identified. Although the following discussion of techniques for finding the gene or
genes associated with a particular trait using linkage disequilibrium mapping refers to locating a single gene which is
responsible for the trait, it will be appreciated that the same techniques may also be used to identify genes which are
partially responsible for the trait.

Association studies may be conducted within the general population (as opposed to the linkage analysis
techniques discussed above which are limited to studies performed on related individuals in one or several affected
families).

Association between a biallelic marker A and a trait T may primarily occur as a result of three possible
relationships between the biallelic marker and the trait.

First, allele a of biallelic marker A may be directly responsible for trait T (e.g., Apo E ε4 site A and Alzheimer's
disease). However, since the majority of the biallelic markers used in genetic mapping studies are selected randomly,
they mainly map outside of genes. Thus, the likelihood of allele a being a functional mutation directly related to trait T is
very low.

Second, an association between a biallelic marker A and a trait T may also occur when the biallelic marker is
very closely linked to the trait locus. In other words, an association occurs when allele a is in linkage disequilibrium with
the trait-causing allele. When the biallelic marker is in close proximity to a gene responsible for the trait, more extensive
genetic mapping will ultimately allow a gene to be discovered near the marker locus which carries mutations in people
with trait T (i.e. the gene responsible for the trait or one of the genes responsible for the trait). As will be further
exemplified below, using a group of biallelic markers which are in close proximity to the gene responsible for the trait, the
location of the causal gene can be deduced from the profile of the association curve between the biallelic markers and
the trait. The causal gene will usually be found in the vicinity of the marker showing the highest association with the
trait.

Finally, an association between a biallelic marker and a trait may occur when people with the trait and people
without the trait correspond to genetically different subsets of the population who, coincidentally, also differ in the
frequency of allele a (population stratification). This phenomenon may be avoided by using ethnically matched large
heterogeneous samples.

Association studies are particularly suited to the efficient identification of genes that present common
polymorphisms, and are involved in multifactorial traits whose frequency is relatively higher than that of diseases with
monofactorial inheritance.

Association studies mainly consist of four steps: recruitment of trait-positive (T+) and trait-negative (T-)
populations with well-defined phenotypes, identification of a candidate region suspected of harboring a trait causing
gene, identification of said gene among candidate genes in the region, and finally validation of mutation(s) responsible for
the trait in said trait causing gene.

In a first step, trait + and trait - phenotypes have to be well-defined. In order to perform efficient and
significant association studies such as those described herein, the trait under study should preferably follow a bimodal
distribution in the population under study, presenting two clear non-overlapping phenotypes, trait + and trait -.

Nevertheless, in the absence of such a bimodal distribution (as may in fact be the case for complex genetic
traits), any genetic trait may still be analyzed using the association method proposed herein by carefully selecting the
individuals to be included in the trait + and trait - phenotypic groups. The selection procedure involves selecting
individuals at opposite ends of the non-bimodal phenotype spectrum of the trait under study, so as to include in these
trait + and trait - populations individuals who clearly represent non-overlapping, preferably extreme phenotypes.

The definition of the inclusion criteria for the trait + and trait - populations is an important aspect of the
present invention. The selection of those drastically different but relatively uniform phenotypes enables efficient
comparisons in association studies and the possible detection of marked differences at the genetic level, provided that
the sample sizes of the populations under study are significant enough.

Generally, trait + and trait - populations to be included in association studies such as those proposed in the
present invention consist of phenotypically homogeneous populations of individuals each representing 100% of the
corresponding phenotype if the trait distribution is bimodal. If the trait distribution is non-bimodal, trait + and trait -
populations consist of phenotypically uniform populations of individuals representing each between 1 and 98%,
preferably between 1 and 80%, more preferably between 1 and 50%, and more preferably between 1 and 30%, most
preferably between 1 and 20% of the total population under study, and selected among individuals exhibiting
non-overlapping phenotypes. In some embodiments, the T+ and T- groups consist of individuals exhibiting the extreme
phenotypes within the studied population. The clearer the difference between the two trait phenotypes, the greater the
probability of detecting an association with biallelic markers.
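
For a non-bimodal trait, the selection of extreme phenotypes can be sketched as taking the lowest and highest fraction
of a ranked population. The sketch assumes a single quantitative measurement per individual and treats the upper tail
as the trait + group; both choices, as well as the function name and the default fraction of 20%, are illustrative
assumptions.

```python
def select_extremes(trait_values, fraction=0.2):
    """Pick non-overlapping groups from both ends of a trait distribution.

    trait_values: dict mapping an individual identifier to a quantitative trait value.
    fraction: share of the population placed in each group (at most 0.5).
    Returns (trait_negative_ids, trait_positive_ids).
    """
    if not 0.0 < fraction <= 0.5:
        raise ValueError("fraction must lie in (0, 0.5]")
    ranked = sorted(trait_values, key=trait_values.get)
    group_size = max(1, int(len(ranked) * fraction))
    if 2 * group_size > len(ranked):
        raise ValueError("population too small for two non-overlapping groups")
    return ranked[:group_size], ranked[-group_size:]


# Hypothetical population of ten individuals with trait values 0..9.
population = {f"id{i}": float(i) for i in range(10)}
negatives, positives = select_extremes(population, fraction=0.2)
print(negatives, positives)
```

Smaller fractions give more sharply separated phenotypes at the cost of smaller groups.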

In preferred embodiments, a first group of between 50 and 300 trait + individuals, preferably about 100
individuals, are recruited according to their phenotypes. In each case, a similar number of trait negative individuals are
included in such studies who are preferably both ethnically- and age-matched to the trait positive cases. Both trait + and
trait - individuals should correspond to unrelated cases.

Figure 3 shows, for a series of hypothetical sample sizes, the p-value significance obtained in association
studies performed using individual markers from the high-density biallelic map, according to various hypotheses regarding
the difference of allelic frequencies between the T+ and T- samples. It indicates that, in all cases, samples ranging from
150 to 500 individuals are numerous enough to achieve statistical significance. It will be appreciated that bigger or
smaller groups can be used to perform association studies according to the methods of the present invention.
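
The dependence of significance on sample size can be explored with a short sketch that computes the p-value expected
for a given difference in allele frequency when T+ and T- groups of equal size are compared. It does not reproduce
Figure 3: the one degree of freedom chi-square on expected allele counts, the function name and the example
frequencies are assumptions.

```python
import math


def expected_p_value(n_per_group, freq_pos, freq_neg):
    """p-value of a 1-df chi-square computed on expected allele counts.

    n_per_group: number of individuals in each of the T+ and T- groups.
    freq_pos, freq_neg: frequency of the tested allele in T+ and T- individuals.
    """
    if n_per_group <= 0:
        raise ValueError("n_per_group must be positive")
    if freq_pos + freq_neg in (0.0, 2.0):
        raise ValueError("the allele must vary in at least one group")
    chromosomes = 2 * n_per_group  # two alleles per individual
    a = freq_pos * chromosomes
    b = chromosomes - a
    c = freq_neg * chromosomes
    d = chromosomes - c
    chi2 = (a + b + c + d) * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical allele frequencies of 30% in T+ and 20% in T- individuals.
for group_size in (50, 150, 500):
    print(group_size, f"{expected_p_value(group_size, 0.30, 0.20):.2e}")
```

For a fixed frequency difference, the expected p-value falls as the groups grow.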

In a second step, a marker/trait association study is performed that compares the genotype frequency of each
biallelic marker in the above described T+ and T- populations by means of a chi square statistical test (one degree of
freedom). In addition to this single marker association analysis, a haplotype association analysis is performed to define
the frequency and the type of the ancestral carrier haplotype. Haplotype analysis, by combining the informativeness of a
set of biallelic markers, increases the power of the association analysis, allowing false positive and/or negative data that
may result from the single marker studies to be eliminated.

Genotyping can be performed using the microsequencing procedure described in Example 13, or any other
genotyping procedure suitable for this intended purpose.

If a positive association with a trait is identified using an array of biallelic markers having a high enough
density, the causal gene will be physically located in the vicinity of the associated markers, since the markers showing
positive association with the trait are in linkage disequilibrium with the trait locus. Regions harboring a gene responsible
for a particular trait which are identified through association studies using high density sets of biallelic markers will, on
average, be 20 - 40 times shorter in length than those identified by linkage analysis.

Once a positive association is confirmed as described above, a third step consists of completely sequencing the
BAC inserts harboring the markers identified in the association analyses. These BACs are obtained through screening
human genomic libraries with the markers probes and/or primers, as described above. Once a candidate region has been
sequenced and analyzed, the functional sequences within the candidate region (e.g. exons, splice sites, promoters, and
other potential regulatory regions) are scanned for mutations which are responsible for the trait by comparing the
sequences of the functional regions in a selected number of T+ and T- individuals using appropriate software. Tools for
sequence analysis are further described in Example 14.

Finally, candidate mutations are then validated by screening a larger population of T+ and
T- individuals using genotyping techniques described below. Polymorphisms are confirmed as
candidate mutations when the validation population shows association results compatible with those
found between the mutation and the trait in the test population.

In practice, in order to define a region bearing a candidate gene, the trait + and trait - populations are
genotyped using an appropriate number of biallelic markers. The markers may include one or more of the 653 markers
obtained above (which include the sequences of SEQ ID Nos: 1-50 and 51-100 or the sequences complementary thereto).

The markers used to define a region bearing a candidate gene may be distributed at an average density of 1
marker per 10-200 kb. Preferably, the markers used to define a region bearing a candidate gene are distributed at an
average density of 1 marker every 15-150 kb. In further preferred embodiments, the markers used to define a region
bearing a candidate gene are distributed at an average density of 1 marker every 20-100kb. In yet another preferred
embodiment, the markers used to define a region bearing a candidate gene are distributed at an average density of 1
marker every 100 to 150kb. In a further highly preferred embodiment, the markers used to define a region bearing a
candidate gene are distributed at an average density of 1 marker every 50 to 100kb. In yet another embodiment, the
biallelic markers used to define a region bearing a candidate gene are distributed at an average density of 1 marker every
25-50 kilobases. As mentioned above, in order to enhance the power of linkage disequilibrium based maps, in a preferred
embodiment, the marker density of the map will be adapted to take the linkage disequilibrium distribution in the genomic
region of interest into account.

In some embodiments, the initial identification of a candidate genomic region harboring a gene associated with
a detectable phenotype may be conducted using a preliminary map containing a few thousand biallelic markers.
Thereafter, the genomic region harboring the gene responsible for the detectable trait may be better delineated using a
map containing a larger number of biallelic markers. Furthermore, the genomic region harboring the gene responsible for
the detectable trait may be further delineated using a high density map of biallelic markers. Finally, the gene associated
with the detectable trait may be identified and isolated using a very high density biallelic marker map.

Example 11 describes a hypothetical procedure for identifying a candidate region harboring a gene associated
with a detectable trait. It will be appreciated that although Example 11 compares the results of analyses using markers
derived from maps having 3,000, 20,000, and 60,000 markers, the number of markers contained in the map is not
restricted to these exemplary figures. Rather, Example 11 exemplifies the increasing refinement of the candidate region
with increasing marker density. As increasing numbers of markers are used in the analysis, points in the association
analysis become broad peaks. The gene associated with the detectable trait under investigation will lie within or near
the region under the peak.

Example 11

Identification of a Candidate Region Harboring a
Gene Associated with a Detectable Trait

The initial identification of a candidate genomic region harboring a gene associated with a detectable trait may
be conducted using a genome-wide map comprising about 20,000 biallelic markers. The candidate genomic region may
be further defined using a map having a higher marker density, such as a map comprising about 40,000 markers, about
60,000 markers, about 80,000 markers, about 100,000 markers, or about 120,000 markers.

The use of high density maps such as those described above allows the identification of genes which are truly
associated with detectable traits, since the coincidental associations will be randomly distributed along the genome
while the true associations will map within one or more discrete genomic regions. Accordingly, biallelic markers located
in the vicinity of a gene associated with a detectable trait will give rise to broad peaks in graphs plotting the frequencies
of the biallelic markers in T+ individuals versus T- individuals. In contrast, biallelic markers which are not in the vicinity
of the gene associated with the detectable trait will produce unique points in such a plot. By determining the
association of several markers within the region containing the gene associated with the detectable trait, the gene
associated with the detectable trait can be identified using an association curve which reflects the difference between
the allele frequencies within the T+ and T- populations for each studied marker. The gene associated with the
detectable trait will be found in the vicinity of the marker showing the highest association with the trait.
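
By way of illustration only, one point of such an association curve may be computed as sketched below. The sketch
assumes that allele counts for a single biallelic marker are available for the T+ and T- groups and uses a standard
2x2 chi-square test with one degree of freedom; the function name allele_association and the example counts are
hypothetical and are not taken from the present description.

```python
import math


def allele_association(tpos_a, tpos_b, tneg_a, tneg_b):
    """Compare allele counts of one biallelic marker between T+ and T- groups.

    Returns (delta, chi2, p) where delta is the difference in the frequency of
    allele A between T+ and T-, chi2 is the 2x2 chi-square statistic and p is
    the corresponding p-value for one degree of freedom.
    """
    n_pos = tpos_a + tpos_b
    n_neg = tneg_a + tneg_b
    n = n_pos + n_neg
    delta = tpos_a / n_pos - tneg_a / n_neg
    denom = n_pos * n_neg * (tpos_a + tneg_a) * (tpos_b + tneg_b)
    if denom == 0:
        # One allele is absent from both groups: no evidence of association.
        return delta, 0.0, 1.0
    chi2 = n * (tpos_a * tneg_b - tpos_b * tneg_a) ** 2 / denom
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return delta, chi2, p


# Hypothetical counts: 60 A / 40 B alleles in T+, 40 A / 60 B alleles in T-.
delta, chi2, p = allele_association(60, 40, 40, 60)
print(f"delta={delta:.2f} chi2={chi2:.2f} p={p:.4f}")
```

Repeating this calculation for each marker along the region and plotting the resulting values against marker position
gives a curve of the kind described above.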

Figures 4, 5, and 6 illustrate the above principles. As illustrated in Figure 4, an association analysis conducted
with a map comprising about 3,000 biallelic markers yields a group of points. However, when an association analysis is
performed using a denser map which includes additional biallelic markers, the points become broad peaks indicative of
the location of a gene associated with a detectable trait. For example, the biallelic markers used in the initial association
analysis may be obtained from a map comprising about 20,000 biallelic markers, as illustrated in Figure 5. In some
embodiments, one or more of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50
and 51-100 or the sequences complementary thereto) are used in the association analysis.

In the hypothetical example of Figure 4, the association analysis with 3,000 markers suggests peaks near
markers 9 and 17.

Next, a second analysis is performed using additional markers in the vicinity of markers 9 and 17, as illustrated
in the hypothetical example of Figure 5, using a map of about 20,000 markers. This step again indicates an association
in the close vicinity of marker 17, since more markers in this region show an association with the trait. However, none
of the additional markers around marker 9 shows a significant association with the trait, which makes marker 9 a
potential false positive. In some embodiments, one or more of the 653 biallelic markers obtained above (which include
the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto) are used in the second
analysis. In order to further test the validity of these two suspected associations, a third analysis may be obtained with
a map comprising about 60,000 biallelic markers. In some embodiments, one or more of the 653 biallelic markers
obtained above are used in the third association analysis. In the hypothetical example of Figure 6, more markers lying
around marker 17 exhibit a high degree of association with the detectable trait. Conversely, no association is confirmed
in the vicinity of marker 9. The genomic region surrounding marker 17 can thus be considered a candidate region for the
hypothetical trait of this simulation.
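
The distinction between a supported association (marker 17) and a potential false positive (marker 9) may be
sketched as follows. This is a minimal illustration which assumes that p-values are listed in map order and that a
true association is supported by at least one other significant marker nearby; the function name supported_hits, the
window size and the threshold are all hypothetical choices made for this example.

```python
def supported_hits(p_values, threshold, window=2, min_neighbors=1):
    """Return indices of significant markers that have significant neighbors.

    p_values      -- p-values of the markers, listed in map order
    threshold     -- a marker is significant when its p-value is below this
    window        -- number of adjacent map positions inspected on each side
    min_neighbors -- other significant markers required within the window
    """
    significant = {i for i, p in enumerate(p_values) if p < threshold}
    kept = []
    for i in sorted(significant):
        neighbors = sum(
            1
            for j in range(i - window, i + window + 1)
            if j != i and j in significant
        )
        if neighbors >= min_neighbors:
            kept.append(i)
    return kept


# Hypothetical p-values: index 1 is an isolated point, indices 6-8 form a peak.
p_values = [0.5, 0.0001, 0.6, 0.4, 0.7, 0.3, 0.0002, 0.0001, 0.0003, 0.9]
print(supported_hits(p_values, threshold=0.001))
```

In this hypothetical input the isolated significant marker is discarded, while the three adjacent significant markers
are retained as a peak.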

The statistical power of LD mapping using a high density marker map is also reinforced by complementing the
single point association analysis described above with a multi-marker association analysis, called haplotype analysis.

When a chromosome carrying a disease allele is first introduced into a population as a result of either mutation
or migration, the mutant allele necessarily resides on a chromosome having a unique set of linked markers: the ancestral
haplotype. As already mentioned, a haplotype association analysis allows the frequency and the type of the ancestral
carrier haplotype to be defined.

A haplotype analysis is performed by estimating the frequencies of all possible haplotypes for a given set of
biallelic markers in the T+ and T- populations, and comparing these frequencies by means of a chi square statistical test
(one degree of freedom). Haplotype estimations are usually performed by applying the Expectation-Maximization (EM)
algorithm (Excoffier L and Slatkin M, Mol. Biol. Evol. 12:921-927 (1995), the disclosure of which is incorporated herein
by reference), using the EM-HAPLO program (Hawley ME, Pakstis AJ & Kidd KK, Am. J. Phys. Anthropol. 18:104
(1994), the disclosure of which is incorporated herein by reference). The EM algorithm is used to estimate haplotype
frequencies in the case when only genotype data from unrelated individuals are available. The EM algorithm is a
generalized iterative maximum likelihood approach to estimation that is useful when data are ambiguous and/or
incomplete.
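
For illustration, the principle of EM-based haplotype frequency estimation may be sketched for the simplest case of two
biallelic markers. The sketch is not the EM-HAPLO program; it assumes unphased genotypes coded as the number of
copies (0, 1 or 2) of a reference allele at each marker, and the function name em_two_locus is hypothetical. Only
double heterozygotes are ambiguous in this case, and the E-step apportions them between the two possible phases
according to the current frequency estimates.

```python
from collections import Counter

HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def em_two_locus(genotypes, n_iter=200, tol=1e-10):
    """Estimate two-marker haplotype frequencies from unphased genotypes.

    genotypes -- list of (g1, g2), each the count (0, 1, 2) of allele "1"
    Returns a dict mapping each haplotype (a1, a2) to its estimated frequency.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")
    counts = Counter(genotypes)
    alleles = {0: (0, 0), 1: (0, 1), 2: (1, 1)}

    # Haplotype counts from individuals whose phase is unambiguous.
    base = {h: 0.0 for h in HAPLOTYPES}
    for (g1, g2), c in counts.items():
        if (g1, g2) == (1, 1):
            continue  # double heterozygote: phase unknown, handled in E-step
        a1, a2 = alleles[g1], alleles[g2]
        base[(a1[0], a2[0])] += c
        base[(a1[1], a2[1])] += c
    n_double_het = counts.get((1, 1), 0)

    freq = {h: 0.25 for h in HAPLOTYPES}
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote is 00/11 vs 01/10.
        cis = freq[(0, 0)] * freq[(1, 1)]
        trans = freq[(0, 1)] * freq[(1, 0)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        expected = dict(base)
        expected[(0, 0)] += n_double_het * w
        expected[(1, 1)] += n_double_het * w
        expected[(0, 1)] += n_double_het * (1 - w)
        expected[(1, 0)] += n_double_het * (1 - w)
        # M-step: new frequencies are expected counts over 2n chromosomes.
        new_freq = {h: expected[h] / (2 * n) for h in HAPLOTYPES}
        change = max(abs(new_freq[h] - freq[h]) for h in HAPLOTYPES)
        freq = new_freq
        if change < tol:
            break
    return freq


# Hypothetical sample: 4 + 4 double homozygotes and 2 double heterozygotes.
sample = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
print(em_two_locus(sample))
```

The frequencies estimated separately in the T+ and T- populations may then be compared by a chi-square test as
described above. Extending the sketch to more than two markers requires enumerating all phase-compatible haplotype
pairs for each genotype.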

To improve the statistical power of the individual marker association analyses conducted as described above
using maps of increasing marker densities, haplotype studies can be performed using groups of markers located in
proximity to one another within regions of the genome. For example, using the methods described above in which the
association of an individual marker with a detectable phenotype was analyzed using maps of 3,000 markers, 20,000
markers, and 60,000 markers, a series of haplotype studies can be performed using groups of contiguous markers from
such maps or from maps having higher marker densities.

In a preferred embodiment, a series of successive haplotype studies including groups of markers spanning
regions of more than 1 Mb may be performed. In some embodiments, the biallelic markers included in each of these
groups may be located within a genomic region spanning less than 1 kb, from 1 to 5 kb, from 5 to 10 kb, from 10 to 25 kb,
from 25 to 50 kb, from 50 to 150 kb, from 150 to 250 kb, from 250 to 500 kb, from 500 kb to 1 Mb, or more than 1 Mb.
Preferably, the genomic regions containing the groups of biallelic markers used in the successive haplotype analyses are
overlapping. It will be appreciated that the groups of biallelic markers need not completely cover the genomic regions of the
above-specified lengths but may instead be obtained from incomplete contigs having one or more gaps therein. As discussed
in further detail below, biallelic markers may be used in single point and haplotype association analyses regardless of the
completeness of the corresponding physical contig harboring them.

Without wishing to be limited to any particular numerical value, it is believed that those haplotypes displaying a
coefficient of relative risk above 1, preferably about 5 or more, preferably of about 7 or more are indicative of a
"significant risk" for the individuals carrying the identified haplotype to develop the given trait. However, it is difficult to
evaluate accurately quantified boundaries for the so-called "significant risk". Indeed, and as it has been demonstrated
previously, several traits observed in a given population are multifactorial in that they are not only the result of a single
genetic predisposition but also of other factors such as environmental factors. Thus, the evaluation of a significant risk
must take these parameters into consideration in order to, in a certain manner, weigh the potential importance of
external parameters in the development of a given trait. Thus, the relative risk which constitutes a "significant risk" to
develop a given trait is evaluated differently depending on the trait under consideration and the populations tested.

Genome wide mapping using association studies with dense enough arrays of markers permits a case-by-case
best estimate of p-value significance thresholds. Given a test population comprising two ethnically matched trait
positive and trait negative groups of about 50 to about 500 individuals or more, conducting the above described
association studies will allow a p-value "cut-off" to be established by, for example, analyzing significant numbers of
allele frequency differences or, in some cases where appropriate, running computer simulations or control studies as
described in Examples 11, 20, and 31.

For a p-value above the threshold, a corresponding association between the trait and a studied marker will be
deemed not significant, while for a p-value below such a threshold, said association will be deemed significant. If the
p-value is significant, the genomic region around the marker will be further scrutinized for a trait-causing gene.

It is preferred that p-value significance thresholds be assessed for each case/control population comparison.
Both the genetic distance between sampled populations, "stratification", and the dispersion due to random selection of
samples may indeed influence the p-value significance thresholds.

It will be appreciated that the above approaches may be conducted on any scale (i.e. over the whole genome, a
set of chromosomes, a single chromosome, a particular subchromosomal region, or any other desired portion of the
genome). As mentioned above, once significance thresholds have been assessed, population sample sizes may be
adapted as exemplified in Figure 3.

Example 12 below illustrates the increase in statistical power brought to an association study by a haplotype
analysis.

Example 12

Haplotype Analysis: Identification of biallelic markers delineating
a genomic region associated with Alzheimer's Disease (AD)

As shown in Table 3 within Example 10, at an average map density of one marker per 40 kb only one marker
(99-365/344) out of five random biallelic markers from a ca. 200 kb genomic region around the Apo E gene showed a
clear association to AD (delta allelic frequency in cases and controls = 18%; p value = 6.9 E-10). The allelic frequencies
of the other four random markers were not significantly different between AD cases and controls (p-values ≥ E-01).
However, since linkage disequilibrium can usually be detected between markers located further apart than an average 40
kb as previously discussed, one should expect that performing an association study with a local excerpt of a biallelic
marker map covering ca. 200 kb with an average inter-marker distance of ca. 40 kb should allow the identification of
more than one biallelic marker associated with AD.

A haplotype analysis was thus performed using the biallelic markers 99-344/439; 99-355/219; 99-359/308;
99-365/344; and 99-366/274 (of SEQ ID Nos: 301-305 and 307-311).

In a first step, marker 99-365/344 that was already found associated with AD was not included in the
haplotype study. Only biallelic markers 99-344/439; 99-355/219; 99-359/308; and 99-366/274, which did not show
any significant association with AD when taken individually, were used. This first haplotype analysis measured
frequencies of all possible two-, three-, or four-marker haplotypes in the AD case and control populations. As shown in
Figure 7, there was one haplotype among all the potential different haplotypes based on the four individually non-significant
markers ("haplotype 8", TAGG comprising SEQ ID No. 305 which is the T allele of marker 99-366/274, SEQ
ID No. 301 which is the A allele of marker 99-344/439, SEQ ID No. 303 which is the G allele of marker 99-359/308 and
SEQ ID No. 302 which is the G allele of marker 99-355/219) that was present at statistically significant different
frequencies in the AD case and control populations (Δ = 12%; p value = 2.05 E-06). Moreover, a significant difference
was already observed for a three-marker haplotype included in the above mentioned "haplotype 8" ("haplotype 7", TGG,
Δ = 10%; p value = 4.76 E-05). Haplotype 7 comprises SEQ ID No. 305 which is the T allele of marker 99-366/274,
SEQ ID No. 303 which is the G allele of marker 99-359/308 and SEQ ID No. 302 which is the G allele of marker
99-355/219. The haplotype association analysis thus clearly increased the statistical power of the individual marker
association studies by more than four orders of magnitude when compared to single-marker analysis (from p values
≥ E-01 for the individual markers, see Table 3, to p value ≤ 2 E-06 for the four-marker "haplotype 8").

The significance of the values obtained for this haplotype association analysis was evaluated by the following
computer simulation. The genotype data from the AD cases and the unaffected controls were pooled and randomly
allocated to two groups which contained the same number of individuals as the case/control groups used to produce the
data summarized in Figure 7. A four-marker haplotype analysis (99-344/439; 99-355/219; 99-359/308; and
99-366/274) was run on these artificial groups. This experiment was reiterated 100 times and the results are shown in
Figure 8. No haplotype among those generated was found for which the p-value of the frequency difference between
both populations was more significant than 1 E-05. In addition, only 4% of the generated haplotypes showed p-values
lower than 1 E-04. Since both these p-value thresholds are less significant than the 2 E-06 p-value shown by
"haplotype 8", this haplotype can be considered significantly associated with AD.
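
The pooling and random reallocation used in this simulation may be sketched as follows. For brevity the sketch
replaces the four-marker haplotype statistic by a single-marker allele-count chi-square test, which is a
simplification and not the analysis actually run; the function names, genotype coding (0, 1 or 2 copies of one
allele) and example data are hypothetical.

```python
import math
import random


def allele_p_value(cases, controls):
    """P-value (chi-square, 1 d.f.) comparing allele counts of two groups.

    Each individual is coded as the number of copies (0, 1, 2) of one allele.
    """
    ca = sum(cases)
    cb = 2 * len(cases) - ca
    ka = sum(controls)
    kb = 2 * len(controls) - ka
    n = ca + cb + ka + kb
    denom = (ca + cb) * (ka + kb) * (ca + ka) * (cb + kb)
    if denom == 0:
        return 1.0
    chi2 = n * (ca * kb - cb * ka) ** 2 / denom
    return math.erfc(math.sqrt(chi2 / 2.0))


def permutation_p_values(cases, controls, n_perm=100, seed=0):
    """Pool both groups, reallocate at random, and recompute the p-value.

    The artificial groups keep the sizes of the original case/control groups.
    """
    rng = random.Random(seed)
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    results = []
    for _ in range(n_perm):
        rng.shuffle(pooled)
        results.append(allele_p_value(pooled[:n_cases], pooled[n_cases:]))
    return results


def empirical_fraction(observed_p, permuted):
    """Fraction of permuted p-values at least as significant as observed_p."""
    return sum(1 for p in permuted if p <= observed_p) / len(permuted)


# Hypothetical genotype data for one marker.
cases = [2] * 20 + [1] * 10
controls = [0] * 20 + [1] * 10
observed = allele_p_value(cases, controls)
permuted = permutation_p_values(cases, controls, n_perm=100, seed=1)
print(observed, min(permuted), empirical_fraction(observed, permuted))
```

An observed p-value that is more significant than every p-value obtained from the artificial groups supports the
conclusion that the association did not arise by chance allocation of individuals.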

In a second step, marker 99-365/344 was included in the haplotype analyses. The frequency differences
between the affected and non affected populations were calculated for all two-, three-, four- or five-marker haplotypes
involving markers: 99-344/439; 99-355/219; 99-359/308; 99-366/274; and 99-365/344. The most significant
p-values obtained in each category of haplotype (involving two, three, four or five markers) were examined depending on
which markers were involved or not within the haplotype. This showed that all haplotypes which included marker
99-365/344 showed a significant association with AD (p-values in the range of E-04 to E-11).

An additional way of evaluating the significance of the values obtained in the haplotype association analysis
was to perform a similar AD case-control study on biallelic markers generated from BACs containing inserts
corresponding to genomic regions derived from chromosomes 13 or 21 and not known to be involved in Alzheimer's
disease. Performing similar haplotype and individual association analyses as those described above and in Example 10
did not generate any significant association results (all p-values for haplotype analyses were less significant than E-03;
all p-values for single marker association studies were less significant than E-02).

The results described in Examples 10 and 12, generated from individual and haplotype studies using a biallelic
marker set of an average density equal to ca. 40 kb in the region of an Alzheimer's disease trait causing gene, indicate
that all biallelic markers of sufficient informative content located within a ca. 200 kb genomic region around a TCA can
potentially be successfully used to localize a trait causing gene with the methods provided by the present invention. This
conclusion is further supported by the results obtained through measuring the linkage disequilibrium between markers
99-365/344 or 99-359/308 and ApoE 4 Site A marker within Alzheimer's patients: as one could predict since LD is the
supporting basis for association studies, LD between these pairs of markers was enhanced in the diseased population vs.
the control population, in a similar way as the haplotype analysis enhanced the significance of the corresponding
association studies.

Once a given polymorphic site has been found and characterized as a biallelic marker according to the methods
of the present invention, several methods can be used in order to determine the specific allele carried by an individual at
the given polymorphic base.

In some embodiments, genotyping will be applied to one or more of the markers of SEQ ID Nos: 301-305 and
307-311 or the sequences complementary thereto. In additional embodiments, genotyping will be applied to the markers
of SEQ ID Nos. 306 and 312 as well as one or more of the markers of SEQ ID Nos. 301-305 and 307-311. In some
embodiments, genotyping will be applied to one or more of the 653 biallelic markers obtained above (which include the
sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto). The present invention further
contemplates the genotyping of any biallelic marker within the provided maps, including those that are in linkage
disequilibrium with the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and
51-100 or the sequences complementary thereto) or the markers of SEQ ID Nos. 301-312 or the sequences complementary
thereto.

Most genotyping methods require the previous amplification of a DNA region carrying the polymorphic site of
interest.

The identification of biallelic markers described previously allows the design of appropriate oligonucleotides,
which can be used as primers to amplify a DNA fragment containing the polymorphic site of interest and for the
detection of such polymorphisms.

In particularly preferred embodiments, pairs of primers of SEQ ID Nos: 313-318 and 319-324 may be used to
generate amplicons harboring the markers of SEQ ID Nos: 301-306/307-312 or the sequences complementary thereto. In
further embodiments, pairs of amplification primers may be used to generate amplicons harboring the 653 markers
obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto).
In highly preferred embodiments, pairs of the amplification primers of SEQ ID Nos: 101-150 and 151-200 may be used
to generate amplicons harboring the markers of SEQ ID Nos: 1-50 and 51-100 or the sequences complementary thereto.

It will be appreciated that amplification primers may be designed having any length suitable for their intended
purpose, in particular any length allowing their hybridization with a region of the DNA fragment to be amplified.

It will be further appreciated that the hybridization site of said amplification primers may be located at any
distance from the polymorphic base to be genotyped, provided said amplification primers allow the proper amplification
of a DNA fragment carrying said polymorphic site. The amplification primers may be oligonucleotides of 10, 15, 20 or
more bases in length which enable the amplification of the polymorphic site in the markers. In some embodiments, the
amplification product produced using these primers may be at least 100 bases in length (i.e. on average 50 nucleotides
on each side of the polymorphic base). In other embodiments, the amplification product produced using these primers
may be at least 500 bases in length (i.e. on average 250 nucleotides on each side of the polymorphic base). In still
further embodiments, the amplification product produced using these primers may be at least 1000 bases in length (i.e.
on average 500 nucleotides on each side of the polymorphic base).

The amplification of polymorphic fragments can be carried out as described in Example 8 on DNA samples
extracted as described in Example 5.

As already mentioned, allele frequencies of biallelic markers tested in association studies (individual or
haplotype) may be determined using microsequencing procedures.

A first step in microsequencing procedures consists in designing microsequencing primers adapted to each
biallelic marker to be genotyped. Microsequencing primers hybridize upstream of the polymorphic base to be genotyped,
either with the coding or with the non-coding strand. Microsequencing primers may be oligonucleotides of 8, 10, 15, 20
or more bases in length. Preferably, the 3' end of the microsequencing primer is immediately upstream of the
polymorphic base of the biallelic marker being genotyped, such that upon extension of the primer, the polymorphic base
is the first base incorporated. Such microsequencing primers are included within the scope of the present invention.

In preferred embodiments, the microsequencing primers are those indicated as features within the sequence
listings corresponding to markers of SEQ ID Nos: 325-330/331-336. In some embodiments, the 653 biallelic markers
obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto)
are genotyped using appropriate microsequencing oligonucleotides such as those of SEQ ID Nos. 201-250 or 251-300.
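
The placement of a microsequencing primer relative to the polymorphic base may be illustrated by the following
sketch. It assumes that the marker is given as a plain DNA string together with the 0-based position of the
polymorphic base; the function names, the example sequence and the primer length are hypothetical and do not
correspond to any sequence of the Sequence Listing.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def microsequencing_primers(seq, snp_index, length=20):
    """Return (forward, reverse) primers whose 3' ends abut the polymorphic base.

    seq       -- marker sequence written 5' to 3'
    snp_index -- 0-based position of the polymorphic base in seq
    length    -- maximum primer length in bases

    The forward primer is identical to the bases immediately upstream of the
    polymorphic base; the reverse primer is the reverse complement of the
    bases immediately downstream, i.e. it hybridizes to the given strand.
    In both cases the first base added on extension is at the polymorphic site.
    """
    if not 0 <= snp_index < len(seq):
        raise ValueError("snp_index outside the sequence")
    forward = seq[max(0, snp_index - length):snp_index]
    downstream = seq[snp_index + 1:snp_index + 1 + length]
    return forward, reverse_complement(downstream)


# Hypothetical marker: "R" stands in for the polymorphic base at index 10.
marker = "ACGTACGTAC" + "R" + "TTGCATTGCA"
print(microsequencing_primers(marker, snp_index=10, length=5))
```

Either primer may be used, consistent with the statement above that the primer may hybridize with the coding or with
the non-coding strand.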

It will be appreciated that the biallelic markers of the present invention may be genotyped using
microsequencing primers having any desirable length, and hybridizing to any of the strands of the marker to be tested,
provided their design is suitable for their intended purpose. In some embodiments, the amplification primers or
microsequencing primers may be labeled. For example, in some embodiments, the amplification primers or
microsequencing primers may be biotinylated.

Typical microsequencing procedures that can be used in the context of the present invention are described in
Example 13 below.



Example 13

Genotyping of biallelic markers using microsequencing procedures

Several microsequencing protocols conducted in liquid phase are well known to those skilled in the art. A first
possible detection analysis allowing the allele characterization of the microsequencing reaction products relies on
detecting fluorescent ddNTP-extended microsequencing primers after gel electrophoresis. A first alternative to this
approach consists in performing a liquid phase microsequencing reaction, the analysis of which may be carried out in
solid phase.

For example, the microsequencing reaction may be performed using 5'-biotinylated oligonucleotide primers and
fluorescein-dideoxynucleotides. The biotinylated oligonucleotide is annealed to the target nucleic acid sequence
immediately adjacent to the polymorphic nucleotide position of interest. It is then specifically extended at its 3'-end
following a PCR cycle, wherein the labeled dideoxynucleotide analog complementary to the polymorphic base is
incorporated. The biotinylated primer is then captured on a microtiter plate coated with streptavidin. The analysis is
thus entirely carried out in a microtiter plate format. The incorporated ddNTP is detected by a fluorescein antibody -
alkaline phosphatase conjugate.

In practice this microsequencing analysis is performed as follows. 20 µl of the microsequencing reaction is
added to 80 µl of capture buffer (SSC 2X, 2.5% PEG 8000, 0.25 M Tris pH 7.5, 1.8% BSA, 0.05% Tween 20) and
incubated for 20 minutes on a microtiter plate coated with streptavidin (Boehringer). The plate is rinsed once with
washing buffer (0.1 M Tris pH 7.5, 0.1 M NaCl, 0.1% Tween 20). 100 µl of anti-fluorescein antibody conjugated with
alkaline phosphatase, diluted 1/5000 in washing buffer containing 1.8% BSA, is added to the microtiter plate. The
antibody is incubated on the microtiter plate for 20 minutes. After washing the microtiter plate four times, 100 µl of
4-methylumbelliferyl phosphate (Sigma) diluted to 0.4 mg/ml in 0.1 M diethanolamine pH 9.6, 10 mM MgCl2 are added. The
detection of the microsequencing reaction is carried out on a fluorimeter (Dynatech) after 20 minutes of incubation.

As another alternative, solid phase microsequencing reactions have been developed, for which either the
oligonucleotide microsequencing primers or the PCR-amplified products derived from the DNA fragment of interest are
immobilized. For example, immobilization can be carried out via an interaction between biotinylated DNA and
streptavidin-coated microtitration wells or avidin-coated polystyrene particles.

As a further alternative, the PCR reaction generating the amplicons to be genotyped can be performed directly
in solid phase conditions, following procedures such as those described in WO 96/13009, the disclosure of which is
incorporated herein by reference.

In such solid phase microsequencing reactions, incorporated ddNTPs can either be radiolabeled (see Syvanen,
Clin. Chim. Acta 226:225-236 (1994), the disclosure of which is incorporated herein by reference) or linked to
fluorescein (see Livak and Hainer, Hum. Mutat. 3:379-385 (1994), the disclosure of which is incorporated herein by
reference). The detection of radiolabeled ddNTPs can be achieved through scintillation-based techniques. The detection
of fluorescein-linked ddNTPs can be based on the binding of antifluorescein antibody conjugated with alkaline
phosphatase, followed by incubation with a chromogenic substrate (such as p-nitrophenyl phosphate).
Other possible reporter-detection couples for use in the above microsequencing procedures include:
ddNTP linked to dinitrophenyl (DNP) and anti-DNP alkaline phosphatase conjugate (see Harju et al., Clin.
Chem. 39(11 Pt 1):2282-2287 (1993), incorporated herein by reference);
biotinylated ddNTP and horseradish peroxidase-conjugated streptavidin with o-phenylenediamine as a substrate (see
WO 92/15712, incorporated herein by reference).

A diagnosis kit based on fluorescein-linked ddNTP with antifluorescein antibody conjugated with alkaline
phosphatase has been commercialized under the name PRONTO by GamidaGen Ltd.

As yet another alternative microsequencing procedure, Nyren et al. (Anal. Biochem. 208:171-175 (1993), the
disclosure of which is incorporated herein by reference) have described a solid-phase DNA sequencing procedure that
relies on the detection of DNA polymerase activity by an enzymatic luminometric inorganic pyrophosphate detection
assay (ELIDA). In this procedure, the PCR-amplified products are biotinylated and immobilized on beads. The
microsequencing primer is annealed and four aliquots of this mixture are separately incubated with DNA polymerase and
one of the four different ddNTPs. After the reaction, the resulting fragments are washed and used as substrates in a
primer extension reaction with all four dNTPs present. The progress of the DNA-directed polymerization reactions is
monitored with the ELIDA. Incorporation of a ddNTP in the first reaction prevents the formation of pyrophosphate during
the subsequent dNTP reaction. In contrast, no ddNTP incorporation in the first reaction gives extensive pyrophosphate
release during the dNTP reaction and this leads to generation of light throughout the ELIDA reactions. From the ELIDA
results, the identity of the first base after the primer is easily deduced.
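
The deduction of the first base from the four ELIDA read-outs may be sketched as follows. The sketch assumes one
luminescence value per ddNTP aliquot in arbitrary units and a user-chosen threshold separating "light" from "no
light"; the function name, the values and the threshold are hypothetical, and samples giving no single clear answer
(for example heterozygous samples) are simply reported as undetermined.

```python
def call_first_base(light_by_ddntp, threshold):
    """Deduce the first base incorporated after the primer from ELIDA signals.

    light_by_ddntp -- dict mapping 'A', 'C', 'G', 'T' to the light measured
                      during the dNTP extension of the corresponding aliquot
    threshold      -- signals below this value are treated as "no light"

    The aliquot in which the ddNTP was incorporated cannot be extended, so it
    releases no pyrophosphate and gives no light. Returns that base, or None
    when zero or several aliquots fall below the threshold.
    """
    blocked = [base for base, signal in light_by_ddntp.items() if signal < threshold]
    return blocked[0] if len(blocked) == 1 else None


# Hypothetical read-out in arbitrary luminescence units.
signals = {"A": 950.0, "C": 40.0, "G": 1010.0, "T": 980.0}
print(call_first_base(signals, threshold=200.0))
```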

It will be appreciated that several parameters of the above-described microsequencing procedures may be
successfully modified by those skilled in the art without undue experimentation. In particular, high throughput
improvements to these procedures may be elaborated, following principles such as those described further below.

It will be further appreciated that any other genotyping procedure may be applied to the genotyping of biallelic
markers.

Once the candidate region has been delineated using the high density biallelic marker map, a sequence analysis
process will allow the detection of all genes located within said region, together with a potential functional
characterization of said genes. The identified functional features may allow preferred trait-causing candidates to be
chosen from among the identified genes. More biallelic markers may then be generated within said candidate genes, and
used to perform refined association studies that will support the identification of the trait causing gene. Sequence
analysis processes are described in Example 14 below.



Example 14: Sequence Analysis

DNA sequences, such as BAC inserts, containing the region carrying the candidate gene associated with the
detectable trait are sequenced and their sequence is analyzed using automated software which eliminates repeat
sequences while retaining potential gene sequences. The potential gene sequences are compared to numerous databases
to identify potential exons using a set of scoring algorithms such as trained Hidden Markov Models, statistical analysis
models (including promoter prediction tools) and the GRAIL neural network. Preferred databases for use in this analysis,
the construction and use of which are further detailed in Example 22 below, include the following:

NetGene database:

This proprietary database contains sequences of 5' cDNA tags, obtained from a number of tissues and cells.
Currently more than 50,000 different 5' clones representing more than 50,000 different genes are included in NetGene.
The sequences in the NetGene database correspond specifically to the 5' regions of transcripts (first exons) and
therefore allow mapping of the beginning of genes within raw genomic sequences.

NRPU (Non-Redundant Protein-Unique) database:

NRPU is a non-redundant merge of the publicly available NBRF/PIR, Genpept, and SwissProt databases.
Homologies found with NRPU allow the identification of regions potentially coding for already known proteins or related
to known proteins (translated exons).

NREST (Non-Redundant EST database):

NREST is a merge of the EST subsection of the publicly available GenBank database. Homologies found with
NREST allow the location of potentially transcribed regions (translated or non-translated exons).

NRN (Non-Redundant Nucleic acid database):

NRN is a merge of GenBank, EMBL and their daily updates.

Any sequence giving a positive hit with NRPU, NREST or an "excellent" score using GRAIL and/or other scoring
algorithms is considered a potential functional region, and is then considered a candidate for genomic analysis.

While this first screening allows the detection of the "strongest" exons, a semi-automatic scan is further
applied to the remaining sequences in the context of the sequence assembly. That is, the sequences neighboring a 5'
site or an exon are submitted to another round of bioinformatics analysis with modified parameters. In this way, new
exon candidates are generated for genomic analysis.


Using the above procedures, genes associated with detectable traits may be identified.

Examples 15-23 illustrate the application of the above methods using biallelic markers to identify a gene
associated with a complex disease, prostate cancer, within a ca. 450 kb candidate region. Additional details of the
identification of the gene associated with prostate cancer are provided in the U.S. Patent Application entitled "Prostate
Cancer Gene" (GENSET.018A, Serial No. 08/996,305), the disclosure of which is incorporated herein by reference.






Use of Biallelic Markers to Identify a Gene Associated with Prostate Cancer
Substantial amounts of LOH data supported the hypothesis that genes associated with distinct cancer types
are located within a particular region of the human genome. More specifically, this region was likely to harbor a gene
associated with prostate cancer. Association studies were performed as described below in order to identify this
prostate cancer gene. A YAC contig containing the genomic region suspected of harboring a gene associated with
prostate cancer was constructed as described in Example 15 below.

Example 15
YAC Contig Construction in the Candidate Genomic Region
First, a YAC contig which contains the candidate genomic region was constructed as follows. The CEPH-Genethon
YAC map for the entire human genome (Chumakov et al. (1995), supra) was used for detailed contig building in
the genomic region containing genetic markers known to map in the candidate genomic region. Screening data available
for several publicly available genetic markers were used to select a set of CEPH YACs located within the candidate
region. This set of YACs was tested by PCR with the above mentioned genetic markers as well as with other publicly
available markers supposedly located within the candidate region. As a result of these studies, a YAC STS contig map
was generated around genetic markers known to map in this genomic region. Two CEPH YACs were found to constitute
a minimal tiling path in this region, with an estimated size of ca. 2 Megabases.

During this mapping effort several publicly known STS markers were precisely located within the contig.
Example 16 below describes the identification of sets of biallelic markers within the candidate genomic region.

Example 16
BAC contig construction and biallelic markers isolation within the candidate chromosomal region
A BAC contig covering the candidate genomic region was constructed as follows. BAC libraries were
obtained as described in Woo et al., Nucleic Acids Res. 22:4922-4931 (1994), the disclosure of which is incorporated
herein by reference. Briefly, the two whole human genome BamHI and HindIII libraries already described in Example 1
were constructed using the pBeloBAC11 vector (Kim et al. (1996), supra).

The BAC libraries were then screened with all of the above mentioned STSs, following the procedure described
in Example 2 above.

The ordered BACs, selected by STS screening and verified by FISH, were assembled into contigs and new
markers were generated by partial sequencing of insert ends from some of them. These markers were used to fill the
gaps in the contig of BAC clones covering the candidate chromosomal region having an estimated size of 2 megabases.

Figure 8 illustrates a minimal array of overlapping clones which was chosen for further studies, and the
positions of the publicly known STS markers along said contig.

Selected BAC clones from the contig were subcloned and sequenced, essentially following the procedures 
described in Examples 3 and 4. 

Biallelic markers lying along the contig were identified following the processes described in Examples 5 and 6.

Figure 9 shows the locations of the biallelic markers along the BAC contig. This first set of markers
corresponds to a medium density map of the candidate locus, with an inter-marker distance averaging 50kb-150kb.

A second set of biallelic markers was then generated as described above in order to provide a very high-density
map of the region identified using the first set of markers which can be used to conduct association studies, as
explained below. This very high density map has markers spaced on average every 2-50kb.

The biallelic markers were then used in association studies. DNA samples were obtained from individuals
suffering from prostate cancer and unaffected individuals as described in Example 17.



Example 17
Collection of DNA Samples from Affected and Non-affected Individuals
Prostate cancer patients were recruited according to clinical inclusion criteria based on pathological or radical
prostatectomy records. Control cases included in this study were both ethnically- and age-matched to the affected
cases; they were checked for both the absence of all clinical and biological criteria defining the presence or the risk of
prostate cancer, and for the absence of related familial prostate cancer cases. Both affected and control individuals
were all unrelated.

The two following groups of independent individuals were used in the association studies. The first group,
comprising individuals suffering from prostate cancer, contained 185 individuals. Of these 185 cases of prostate
cancer, 47 cases were sporadic and 138 cases were familial. The control group contained 104 non-diseased individuals.

Haplotype analysis was conducted using additional diseased (total samples: 281) and control samples (total
samples: 130), from individuals recruited according to similar criteria.

DNA was extracted from peripheral venous blood of all individuals as described in Example 5.

The frequencies of the biallelic markers in each population were determined as described in Example 18.

Example 18
Genotyping Affected and Control Individuals
Genotyping was performed using the following microsequencing procedure.
Amplification was performed on each DNA sample using primers designed as previously explained. The pairs of primers
were used to generate amplicons harboring the biallelic markers 99-123, 4-26, 4-77, 99-217, 4-67, 99-213, 99-221,
99-135, 99-1482, 4-73, and 4-65 using the protocols described in Example 6 above.

Microsequencing primers were designed for each of the biallelic markers, as previously described.
After purification of the amplification products, the microsequencing reaction mixture was prepared by adding, in a 20 µl
final volume: 10 pmol microsequencing oligonucleotide, 1 U Thermosequenase (Amersham E79000G), 1.25 µl
Thermosequenase buffer (260 mM Tris HCl pH 9.5, 65 mM MgCl2), and the two appropriate fluorescent ddNTPs (Perkin
Elmer, Dye Terminator Set 401095) complementary to the nucleotides at the polymorphic site of each biallelic marker
tested, following the manufacturer's recommendations. After 4 minutes at 94°C, 20 PCR cycles of 15 sec at 55°C, 5
sec at 72°C, and 10 sec at 94°C were carried out in a Tetrad PTC-225 thermocycler (MJ Research). The
unincorporated dye terminators were then removed by ethanol precipitation. Samples were finally resuspended in
formamide-EDTA loading buffer and heated for 2 min at 95°C before being loaded on a polyacrylamide sequencing gel.
The data were collected by an ABI PRISM 377 DNA sequencer and processed using the GENESCAN software (Perkin
Elmer).

Following gel analysis, data were automatically processed with software that allows the determination of the
alleles of biallelic markers present in each amplified fragment.

The software evaluates such factors as whether the intensities of the signals resulting from the above
microsequencing procedures are weak, normal or saturated, or whether the signals are ambiguous. In addition, the
software identifies significant peaks (according to shape and height criteria). Among the significant peaks, peaks
corresponding to the targeted site are identified based on their position. When two significant peaks are detected for
the same position, each sample is categorized as homozygous or heterozygous based on the height ratio.
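
The height-ratio rule described above can be illustrated with a minimal Python sketch. The function name, the minimum signal height and the ratio cutoff are hypothetical values chosen only for illustration; the text does not disclose the thresholds used by the actual software.

```python
def call_genotype(height_1, height_2, allele_1, allele_2,
                  min_height=50.0, het_ratio=0.3):
    """Call a genotype from the two dye peak heights at the polymorphic site.

    min_height and het_ratio are illustrative assumptions, not values
    taken from the source text.
    """
    top = max(height_1, height_2)
    if top < min_height:
        # Signal too weak to call.
        return None
    ratio = min(height_1, height_2) / top
    if ratio >= het_ratio:
        # Both peaks are significant: heterozygous sample.
        return allele_1 + allele_2
    # Only one significant peak: homozygous for the stronger allele.
    return allele_1 * 2 if height_1 > height_2 else allele_2 * 2
```

For instance, two peaks of comparable height yield a heterozygous call, whereas a single dominant peak yields a homozygous call.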

Association analyses were then performed using the biallelic markers as described below.

Example 19
Association Analysis

Association studies were run in two successive steps. In a first step, a rough localization of the candidate
gene was achieved by determining the frequencies of the biallelic markers of Figure 9 in the affected and unaffected
populations. The results of this rough localization are shown in Figure 10. This analysis indicated that a gene
responsible for prostate cancer was located near the biallelic marker designated 4-67.

In a second phase of the analysis, the position of the gene responsible for prostate cancer was further refined using the
very high density set of markers including the 99-123, 4-26, 4-14, 4-77, 99-217, 4-67, 99-213, 99-221, 99-135,
99-1482, 4-73, and 4-65 markers.

As shown in Figure 11, the second phase of the analysis confirmed that the gene responsible for prostate
cancer was near the biallelic marker designated 4-67, most probably within a ca. 150kb region comprising the marker.

A haplotype analysis was also performed as described in Example 20.

Example 20
Haplotype analysis

The allelic frequencies of each of the alleles of biallelic markers 99-123, 4-26, 4-14, 4-77, 99-217, 4-67,
99-213, 99-221, and 99-135 were determined in the affected and unaffected populations. Table 4 lists the internal
identification numbers of the markers used in the haplotype analysis, the alleles of each marker, the most frequent allele
in both unaffected individuals and individuals suffering from prostate cancer, the least frequent allele in both unaffected
individuals and individuals suffering from prostate cancer, and the frequencies of the least frequent alleles in each
population.

Table 4

                                  Frequency of least frequent allele **
Marker      Polymorphic base *    Cases       Controls
99-123      C/T                   0.35        0.3
4-26        A/G                   0.39        0.45
4-14        C/T                   0.35        0.41
4-77        C/G                   0.33        0.24
99-217      C/T                   0.31        0.23
4-67        C/T                   0.26        0.16
99-213      T/C                   0.45        0.38
99-221      C/A                   0.43        0.43
99-135      A/G                   0.25        0.3

* most frequent allele/least frequent allele
** standard deviations: 0.023 to 0.031 for controls; 0.018 to 0.021 for cases



Among all the theoretical potential different haplotypes based on 2 to 9 markers, 11 haplotypes showing a
strong association with prostate cancer were selected. The results of these haplotype analyses are shown in Figure 12.

Figures 11 and 12 aggregate association analysis results with sequencing results - generated following the
procedures further described in Example 21 - which permitted the physical order and/or the distance between markers to
be estimated.

The significance of the values obtained in Figure 12 is underscored by the following results of computer
simulations. For the computer simulations, the data from the affected individuals and the unaffected controls were
pooled and randomly allocated to two groups which contained the same number of individuals as the affected and
unaffected groups used to compile the data summarized in Figure 12. A haplotype analysis was run on these artificial
groups for the six markers included in haplotype 5 of Figure 12. This experiment was reiterated 100 times and the
results are shown in Figure 13. Among 100 iterations, only 5% of the obtained haplotypes are present with a p-value
below 1E-04 as compared to the p-value of 9E-07 for haplotype 5 of Figure 12. Furthermore, for haplotype
5 of Figure 12, only 6% of the obtained haplotypes have a significance level below 5E-03, while none of them show a
significance level below 5E-05.
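
The pooling-and-reallocation simulation described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the carrier-frequency statistic and the fixed random seed are assumptions, and the haplotype analysis actually run on the artificial groups is replaced here by a simple difference in carrier frequency.

```python
import random

def carrier_freq_diff(group_a, group_b):
    """Absolute difference in carrier frequency between two groups of 0/1 flags."""
    return abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))

def permutation_test(cases, controls, statistic=carrier_freq_diff,
                     n_iter=100, seed=0):
    """Pool both groups, reallocate them at random n_iter times, and report
    the fraction of artificial groupings at least as extreme as the observed one."""
    rng = random.Random(seed)
    observed = statistic(cases, controls)
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        # Artificial groups keep the original group sizes.
        if statistic(pooled[:n_cases], pooled[n_cases:]) >= observed:
            hits += 1
    return observed, hits / n_iter
```

A small fraction of random reallocations reaching the observed value indicates that the observed association is unlikely to arise by chance.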

Thus, using the data of Figure 13 and evaluating the associations for single marker alleles or for haplotypes
will permit estimation of the risk a corresponding carrier has to develop prostate cancer. It will be appreciated that
significance thresholds of relative risks will be more finely assessed according to the population tested.

Diagnostic techniques for determining an individual's risk of developing prostate cancer may be implemented as
described below for the markers in the maps of the present invention, including the 99-123, 4-26, 4-14, 4-77, 99-217,
4-67, 99-213, 99-221, and 99-135 markers.

The above haplotype analysis indicated that 171kb of genomic DNA between biallelic markers 4-14 and
99-221 totally or partially contains a gene responsible for prostate cancer. Therefore, the protein coding sequences lying
within this region were characterized to locate the gene associated with prostate cancer. This analysis, described in
further detail below, revealed a single protein coding sequence in the 171 kb genomic region, which was designated as
the PG1 gene.

Example 21
Identification of the Genomic Sequence in the Candidate Region
Template DNA for sequencing the PG1 gene was obtained as follows. BACs E and F from Fig. 9 were subcloned
as previously described. Plasmid inserts were first amplified by PCR on PE 9600 thermocyclers (Perkin-Elmer), using
appropriate primers, AmpliTaqGold (Perkin-Elmer), dNTPs (Boehringer), buffer and cycling conditions as recommended by the
Perkin-Elmer Corporation.

PCR products were then sequenced using automatic ABI Prism 377 sequencers (Perkin Elmer, Applied Biosystems
Division, Foster City, CA). Sequencing reactions were performed using PE 9600 thermocyclers (Perkin Elmer) with standard
dye-primer chemistry and ThermoSequenase (Amersham Life Science). The primers were labeled with the JOE, FAM, ROX
and TAMRA dyes. The dNTPs and ddNTPs used in the sequencing reactions were purchased from Boehringer. Sequencing
buffer, reagent concentrations and cycling conditions were as recommended by Amersham.

Following the sequencing reaction, the samples were precipitated with EtOH, resuspended in formamide loading
buffer, and loaded on a standard 4% acrylamide gel. Electrophoresis was performed for 2.5 hours at 3000V on an ABI 377
sequencer, and the sequence data were collected and analyzed using the ABI Prism DNA Sequencing Analysis Software,
version 

The sequence data obtained as described above were transferred to a proprietary database, where quality control
and validation steps were performed. A proprietary base-caller flagged suspect peaks, taking into account the shape of the
peaks, the inter-peak resolution, and the noise level. The proprietary base-caller also performed an automatic trimming. Any
stretch of 25 or fewer bases having more than 4 suspect peaks was considered unreliable and was discarded.
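
The trimming rule stated above (more than 4 suspect peaks within 25 bases) can be illustrated by the following Python sketch. The base-caller itself is proprietary and is not described in the text; the function name and the choice to mark every base of an offending window are assumptions made for illustration.

```python
def unreliable_mask(suspect, window=25, max_suspect=4):
    """Return a per-base list of booleans marking unreliable positions.

    suspect: list of booleans, True where the base-caller flagged a suspect peak.
    Every base of any window of at most `window` bases that contains more
    than `max_suspect` suspect peaks is marked as unreliable.
    """
    n = len(suspect)
    mask = [False] * n
    for start in range(n):
        end = min(start + window, n)
        if sum(suspect[start:end]) > max_suspect:
            for i in range(start, end):
                mask[i] = True
    return mask
```

Bases marked True would be discarded before assembly; a read containing four or fewer suspect peaks in every window is left untouched.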

The sequence fragments from BAC subclones isolated as described above were assembled using Gap4
software from R. Staden (Bonfield et al., 1995). This software allows the reconstruction of a single sequence from
sequence fragments. The sequence deduced from the alignment of different fragments is called the consensus
sequence. Directed sequencing techniques (primer walking) were used to complete sequences and link contigs.
Potential functional sequences were then identified as described in Example 22.

Example 22
Identification of Functional Sequences

Potential exons in BAC-derived human genomic sequences were located by homology searches on protein, nucleic
acid and EST (Expressed Sequence Tags) public databases. Main public databases were locally reconstructed as mentioned
in Example 14. The protein database, NRPU (Non-redundant Protein Unique), is formed by a non-redundant fusion of the
Genpept (Benson et al., Nucleic Acids Res. 24:1-5 (1996), the disclosure of which is incorporated herein by reference),
Swissprot (Bairoch, A. and Apweiler, R., Nucleic Acids Res. 24:21-25 (1996), the disclosure of which is incorporated herein
by reference) and PIR/NBRF (George et al., Nucleic Acids Res. 24:17-20 (1996), the disclosure of which is incorporated
herein by reference) databases. Redundant data were eliminated by using the NRDB software (Benson et al. (1996), supra)
and internal repeats were masked with the XNU software (Benson et al., supra). Homologies found using the NRPU
database allowed the identification of sequences corresponding to potential coding exons related to known proteins.

The EST local database is composed of the gbest section (1-9) of GenBank (Benson et al. (1996), supra) and thus
contains all publicly available transcript fragments. Homologies found with this database allowed the localization of
potentially transcribed regions.

The local nucleic acid database contained all sections of GenBank and EMBL (Rodriguez-Tome et al., Nucleic Acids
Res. 24:6-12 (1996), the disclosure of which is incorporated herein by reference) except the EST sections. Redundant data
were eliminated as previously described.

Similarity searches in protein or nucleic acid databases were performed using the BLAST software (Altschul et
al., J. Mol. Biol. 215:403-410 (1990), the disclosure of which is incorporated herein by reference). Alignments were refined
using the Fasta software, and multiple alignments used Clustal W. Homology thresholds were adjusted for each analysis
based on the length and the complexity of the tested region, as well as on the size of the reference database.

Potential exon sequences identified as above were used as probes to screen cDNA libraries. Extremities of positive
clones were sequenced and the sequence stretches were positioned on the genomic sequence determined above. Primers
were then designed using the results from these alignments in order to enable the cloning of cDNAs derived from the gene
associated with prostate cancer that was identified using the above procedure.

The obtained cDNA molecules were then sequenced and results of Northern blot analysis of prostate mRNAs
supported the existence of a major cDNA having a 5-6kb length. The structure of the gene associated with prostate cancer
was evaluated as described in Example 23.

Example 23 
Analysis of Gene Structure 

The intron/exon structure of the gene was finally completely deduced by aligning the mRNA sequence from the
cDNA obtained as described above and the genomic DNA sequence obtained as described above. This alignment
permitted the determination, in the genomic sequence, of the positions of the introns and exons, the positions of the
start and end nucleotides defining each of the at least 8 exons, the locations and phases of the 5' and 3' splice sites,
the position of the stop codon, and the position of the polyadenylation site. This analysis also yielded



wo 99/04038 





PCT/IB98/01193 



the positions of the coding region in the mRNA, and the locations of the polyadenylation signal and polyA stretch in the
mRNA.

The gene identified as described above comprises at least 8 exons and spans more than 52kb. A G/C rich
putative promoter region was identified upstream of the coding sequence. A CCAAT in the putative promoter was also
identified. The promoter region was identified as described in Prestridge, D.S., Predicting Pol II Promoter Sequences
Using Transcription Factor Binding Sites, J. Mol. Biol. 249:923-932 (1995), the disclosure of which is incorporated
herein by reference.

Additional analysis using conventional techniques, such as a 5'RACE reaction using the Marathon-Ready
human prostate cDNA kit from Clontech (Catalog No. PT1156-1), may be performed to confirm that the 5' end of the cDNA
obtained above is the authentic 5' end in the mRNA.

Alternatively, the 5' sequence of the transcript can be determined by conducting a PCR amplification with a
series of primers extending from the 5' end of the identified coding region.

The above methods were also used to identify biallelic markers in a gene which was an attractive candidate for
a gene associated with asthma. Examples 24-31 show how the use of methods of the present invention allowed this
gene to be identified as a gene responsible, at least partially, for asthma in the studied populations. Additional details of
the identification of the gene associated with asthma are provided in U.S. Provisional Application Serial No.
60/081,893 (Genset.025PR) and U.S. Provisional Patent Application Genset.026PR2, the disclosures of which are
incorporated herein by reference.



Example 24
Detection of biallelic markers in the candidate gene: DNA extraction

Donors were unrelated and healthy. They presented a sufficient diversity for being representative of a French
heterogeneous population. The DNA from 100 individuals was extracted and tested for the detection of the biallelic
markers.

30 ml of peripheral venous blood were taken from each donor in the presence of EDTA. Cells (pellet) were
collected after centrifugation for 10 minutes at 2000 rpm. Red cells were lysed by a lysis solution (50 ml final volume:
10 mM Tris pH7.6; 5 mM MgCl2; 10 mM NaCl). The solution was centrifuged (10 minutes, 2000 rpm) as many times as
necessary to eliminate the residual red cells present in the supernatant, after resuspension of the pellet in the lysis
solution.

The pellet of white cells was lysed overnight at 42°C with 3.7 ml of lysis solution composed of:
- 3 ml TE 10-2 (Tris-HCl 10 mM, EDTA 2 mM) / NaCl 0.4 M
- 200 µl SDS 10%
- 500 µl K-proteinase (2 mg K-proteinase in TE 10-2 / NaCl 0.4 M)

For the extraction of proteins, 1 ml saturated NaCl (6M) (1/3.5 v/v) was added. After vigorous agitation, the
solution was centrifuged for 20 minutes at 10000 rpm.

For the precipitation of DNA, 2 to 3 volumes of 100% ethanol were added to the previous supernatant, and the solution
was centrifuged for 30 minutes at 2000 rpm. The DNA solution was rinsed three times with 70% ethanol to eliminate
salts, and centrifuged for 20 minutes at 2000 rpm. The pellet was dried at 37°C, and resuspended in 1 ml TE 10-1 or 1
ml water. The DNA concentration was evaluated by measuring the OD at 260 nm (1 unit OD = 50 µg/ml DNA).

To determine the presence of proteins in the DNA solution, the OD 260 / OD 280 ratio was determined. Only
DNA preparations having an OD 260 / OD 280 ratio between 1.8 and 2 were used in the subsequent examples described
below.

The pool was constituted by mixing equivalent quantities of DNA from each individual.
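
The two spectrophotometric checks above (1 OD unit at 260 nm = 50 µg/ml DNA, and an OD 260 / OD 280 ratio between 1.8 and 2) can be written as a short Python sketch. The function names and the optional dilution factor are assumptions added for illustration; they do not appear in the text.

```python
def dna_concentration_ug_per_ml(od260, dilution_factor=1.0):
    """Estimate DNA concentration using 1 OD260 unit = 50 micrograms/ml.

    dilution_factor is an assumed convenience parameter for diluted readings.
    """
    return od260 * 50.0 * dilution_factor

def passes_purity_check(od260, od280, low=1.8, high=2.0):
    """Accept a preparation only if its OD260/OD280 ratio lies in [low, high]."""
    ratio = od260 / od280
    return low <= ratio <= high
```

For example, a reading of 0.5 OD at 260 nm on an undiluted sample corresponds to 25 µg/ml, and a preparation with OD 260 = 1.9 and OD 280 = 1.0 would be retained.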

Example 25
Detection of the biallelic markers: amplification of genomic DNA by PCR
The amplification of specific genomic sequences of the DNA samples of Example 24 was carried out on the
pool of DNA obtained previously. In addition, 50 individual samples were similarly amplified.

PCR assays were performed using the following protocol:

Final volume                                        25 µl
DNA                                                 2 ng/µl
MgCl2                                               2 mM
dNTP (each)                                         200 µM
primer (each)                                       2.9 ng/µl
Ampli Taq Gold DNA polymerase                       0.05 unit/µl
PCR buffer (10x = 0.1 M TrisHCl pH8.3 0.5M KCl)     1x

Pairs of first primers were designed to amplify the promoter region, exons, and 3' end of the candidate
asthma-associated gene using the sequence information of the candidate gene and the OSP software (Hillier & Green, 1991).
These first primers were about 20 nucleotides in length and contained a common oligonucleotide tail upstream of the
specific bases targeted for amplification which was useful for sequencing. The synthesis of these primers was
performed following the phosphoramidite method, on a GENSET UFPS 24.1 synthesizer.

DNA amplification was performed on a Genius II thermocycler. After heating at 94°C for 10 min, 40 cycles
were performed. Each cycle comprised: 30 sec at 94°C, 55°C for 1 min, and 30 sec at 72°C. For final elongation, 7 min
at 72°C ended the amplification. The quantities of the amplification products obtained were determined on 96-well
microtiter plates, using a fluorometer and Picogreen as intercalant agent (Molecular Probes).

Example 26
Detection of the biallelic markers: sequencing of amplified genomic DNA and identification of polymorphisms
The sequencing of the amplified DNA obtained in Example 25 was carried out on ABI 377 sequencers. The
sequences of the amplification products were determined using automated dideoxy terminator sequencing reactions with
a dye terminator cycle sequencing protocol. The products of the sequencing reactions were run on sequencing gels and
the sequences were analyzed as formerly described.

The sequence data were further evaluated using the above mentioned polymorphism analysis software
designed to detect the presence of biallelic markers among the pooled amplified fragments. The polymorphism search
was based on the presence of superimposed peaks in the electrophoresis pattern resulting from different bases occurring
at the same position as described previously.

Six fragments of amplification were analyzed. In these segments, 8 biallelic markers were detected. The
localization of the biallelic markers, the polymorphic bases of each allele, and the frequencies of the most frequent
alleles were as shown in Table 5.



Table 5

Amplicon    Marker Name    Origin of DNA    Localization in gene    Polymorphism    Frequency
1           204/326        Ind.             Promoter                A/G             96.2 (G)
2           32/357         Pool             Intron 1                A/C             67.7 (C)
3           33/175         Ind.             Exon 2                  C/T             87.3 (C)
3           33/234         Pool             Intron 2                A/C             56.7 (C)
3           33/327         Ind.             Intron 2                C/T             75.3 (T)
5           35/358         Pool             Intron 4                C/G             67.9 (G)
5           35/390         Ind.             Intron 4                C/T             82 (C)
6           36/164         Ind.             Exon 6                  A/G             99.5 (G)



Allelic frequencies were determined in a population of random blood donors of French Caucasian origin. Their wide
range is due to the fact that besides screening a pool of 100 individuals to generate biallelic markers as described
above, polymorphism searches were also conducted in an individual testing format for 50 samples. This strategy was
chosen here to provide a potential shortcut towards the identification of putative causal mutations in the association
studies using them. As the 36/164 biallelic marker was found in only one individual, this marker was not considered in
the association studies.

The fourth fragment of amplification carrying exon 3 (not shown in the Table) was not polymorphic in the
tested samples (1 pool + 50 individuals).




Example 27
Validation of the polymorphisms through microsequencing
The biallelic markers identified in Example 26 were further confirmed and their respective frequencies were
determined through microsequencing. Microsequencing was carried out for each individual DNA sample described in
Example 24.

Amplification from genomic DNA of individuals was performed by PCR as described above for the detection of
the biallelic markers with the same set of PCR primers described above.

The preferred primers used in microsequencing were about 19 nucleotides in length and hybridized just upstream
of the considered polymorphic base.

Five primers hybridized with the non-coding strand of the gene. For the biallelic markers 204/326, 35/358 and 36/164,
primers hybridized with the coding strand of the gene.

The microsequencing reaction was performed as described in Example 18.

Example 28
Association study between asthma and the biallelic markers of the candidate gene: collection of DNA samples from
affected and non-affected individuals
The asthmatic population used to perform association studies in order to establish whether the candidate gene
was an asthma-causing gene consisted of 298 individuals. More than 20 % of these 298 asthmatic individuals had a
Caucasian ethnic background.

The control population consisted of 373 unaffected individuals, among which 279 were French (at least 70 % were
of Caucasian origin) and 94 American (at least 90 % were of Caucasian origin).

DNA samples were obtained from asthmatic and non-asthmatic individuals as described above.

Example 29
Association study between asthma and the biallelic markers of the candidate gene: genotyping of affected and control
individuals

The general strategy to perform the association studies was to individually scan the DNA samples from all
individuals in each of the populations described above in order to establish the allele frequencies of the above described
biallelic markers in each of these populations.

Allelic frequencies of the above-described biallelic markers in each population were determined by performing
microsequencing reactions on amplified fragments obtained by genomic PCR performed on the DNA samples from each
individual. Genomic PCR and microsequencing were performed as detailed above in Examples 25 and 27 using the
described amplification and microsequencing primers.

Example 30
Association study between asthma and the biallelic markers of the candidate gene
Table 6 shows the results of the association study between five biallelic markers in the candidate gene and
asthma.

Table 6

            Allelic frequencies (%)
Markers     Asthmatics            Controls              Frequency diff.    P value
            (298 individuals)     (373 individuals)
32/357      A 38.6                A 29.8                8.8                7.34x10-4
33/234      A 49                  A 44.3                4.7                8.86x10-2
33/327      T 78.5                T 74.6                3.9                1.0x10-1
35/358      G 72.3                G 66.9                5.4                3.53x10-2
35/390      T 30.4                T 20.3                10.1               2.33x10-5

As shown in Table 6, markers 32/357 and 35/390 presented a strong association with asthma, this association being
highly significant (p value = 7.34x10-4 for marker 32/357 and 2.33x10-5 for marker 35/390).
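
As an illustration of how an allele-frequency difference translates into a p value, the following Python sketch applies a Pearson chi-square test (one degree of freedom, no continuity correction) to a 2x2 table of allele counts. It assumes that every individual was genotyped and contributes two chromosomes, which the text does not state, so its output approximates rather than reproduces the reported p values. The function name is hypothetical.

```python
import math

def allele_association(freq_cases, freq_controls, n_cases, n_controls):
    """Chi-square test (1 df) comparing an allele frequency between two groups.

    Frequencies are fractions (e.g. 0.304), group sizes are numbers of
    individuals; each individual is assumed to contribute two chromosomes.
    """
    chrom_cases = 2 * n_cases
    chrom_controls = 2 * n_controls
    a = freq_cases * chrom_cases          # allele present, cases
    b = chrom_cases - a                   # allele absent, cases
    c = freq_controls * chrom_controls    # allele present, controls
    d = chrom_controls - c                # allele absent, controls
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Applied to the 35/390 frequencies (30.4 % versus 20.3 % in 298 and 373 individuals), this sketch gives a p value of the order of 10-5, in line with the value reported for that marker.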

Three markers showed moderate association when tested independently, namely 33/234, 33/327, 35/358.
It is worth mentioning that allelic frequencies for each of the biallelic markers of Table 6 were separately
measured within the French control population (279 individuals) and the American control population (94 individuals).
The differences in allele frequencies between the two populations were between 1 % and 1% with p-values above 10'\.
These data confirmed that the combined French/American control population (373 individuals) was homogeneous enough
to be used as a control population for the present association study.

Example 31
Association studies: Haplotype frequency analysis
As already shown, one way of increasing the statistical power of individual markers is by performing
haplotype association analysis. A haplotype analysis for association of markers in the candidate gene and asthma was
performed by estimating the frequencies of all possible haplotypes for biallelic markers 32/357, 33/234, 33/327, 35/358
and 35/390 in the asthmatic and control populations described in Example 30 (Table 6), and comparing those frequencies
by means of a chi square statistical test (one degree of freedom). Haplotype estimations were performed by applying the
Expectation-Maximization (EM) algorithm (Excoffier L & Slatkin M, 1995, Mol. Biol. Evol. 12:921-927), using the
EM-HAPLO program (Hawley ME, Pakstis AJ & Kidd KK, 1994, Am. J. Phys. Anthropol. 18:104).
The results of such haplotype analysis are shown in Table 7.
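
The principle of EM haplotype estimation can be illustrated with a minimal Python sketch restricted to two biallelic loci. This is not the EM-HAPLO program and does not handle five markers; the function name, the genotype-count layout and the convergence settings are assumptions made for illustration only.

```python
def em_two_locus(geno_counts, max_iter=500, tol=1e-12):
    """Estimate two-locus haplotype frequencies by Expectation-Maximization.

    geno_counts[i][j] is the number of individuals carrying i copies of
    allele A at locus 1 and j copies of allele B at locus 2 (i, j in 0..2).
    Only double heterozygotes (i == j == 1) have ambiguous phase.
    """
    g = geno_counts
    n = sum(sum(row) for row in g)
    # Haplotype counts contributed by phase-unambiguous genotypes.
    fixed = {
        "AB": 2 * g[2][2] + g[2][1] + g[1][2],
        "Ab": 2 * g[2][0] + g[2][1] + g[1][0],
        "aB": 2 * g[0][2] + g[1][2] + g[0][1],
        "ab": 2 * g[0][0] + g[1][0] + g[0][1],
    }
    double_het = g[1][1]
    freqs = dict.fromkeys(fixed, 0.25)
    for _ in range(max_iter):
        # E-step: probability that a double heterozygote is AB/ab rather than Ab/aB.
        cis = freqs["AB"] * freqs["ab"]
        trans = freqs["Ab"] * freqs["aB"]
        w = cis / (cis + trans) if (cis + trans) > 0 else 0.5
        # M-step: re-estimate frequencies from expected haplotype counts.
        counts = {
            "AB": fixed["AB"] + double_het * w,
            "ab": fixed["ab"] + double_het * w,
            "Ab": fixed["Ab"] + double_het * (1 - w),
            "aB": fixed["aB"] + double_het * (1 - w),
        }
        new = {h: c / (2 * n) for h, c in counts.items()}
        delta = max(abs(new[h] - freqs[h]) for h in freqs)
        freqs = new
        if delta < tol:
            break
    return freqs
```

The haplotype frequencies estimated separately in cases and controls can then be compared with a chi square test, as described above.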



Table 7

                    Markers                                                     Haplotype frequencies
                    32/357       33/234    33/327      35/358       35/390      Asthm.    Controls    Odds ratio
Frequency diff.     8.8                    3.9         5.4          10.1
P value             7.34x10-4              1.0x10-1    3.53x10-2    2.33x10-5
Haplotype 1         A                                               T           0.2       0.11        2.02
Haplotype 2                      A         T           G                        0.27      0.18        1.66
Haplotype 3         A            A         T           G            T           0.18      0.09        2.22



A two-marker haplotype covering markers 32/357 and 35/390 (haplotype 1, AT alleles respectively) presented
a p value of 8.47x10^-[illegible], an odds ratio of 2.02 and haplotype frequencies of 0.2 for asthmatic and 0.11 for
control populations respectively.

A three-marker haplotype covering markers 33/234, 33/327 and 35/358 (haplotype 2, ATG alleles respectively)
presented a p value of 2.81x10^-4, an odds ratio of 1.66 and haplotype frequencies of 0.27 for asthmatic and 0.18 for
control populations respectively.

A five-marker haplotype covering markers 32/357, 33/234, 33/327, 35/358 and 35/390 (haplotype 3, AATGT
alleles respectively) presented a p value of 3.95x10^-5, an odds ratio of 2.22 and haplotype frequencies of 0.18 for
asthmatic and 0.09 for control populations respectively.
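
As a rough check on the figures above, the odds ratio for carrying a haplotype can be computed from the two haplotype frequencies. The function name is hypothetical, and the calculation assumes the odds ratio is taken on haplotype (chromosome) frequencies, which the text does not state explicitly.

```python
def haplotype_odds_ratio(freq_cases, freq_controls):
    """Odds of the haplotype in cases divided by the odds in controls."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


if __name__ == "__main__":
    # Rounded haplotype frequencies as reported for haplotypes 1, 2 and 3
    for name, cases, controls in [("haplotype 1", 0.20, 0.11),
                                  ("haplotype 2", 0.27, 0.18),
                                  ("haplotype 3", 0.18, 0.09)]:
        print(name, round(haplotype_odds_ratio(cases, controls), 2))
```

Under this assumption the rounded frequencies give 2.02 and 2.22 for haplotypes 1 and 3, matching the reported values, and about 1.68 for haplotype 2 against the reported 1.66; the small gap is presumably due to rounding of the published frequencies.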

Haplotype association analysis thus increased the statistical power of the individual marker association
studies when compared to single-marker analysis (from p values between 10^-[illegible] and 2x10^-5 for the individual
markers to p values between 3x10^-[illegible] and 8x10^-[illegible] for the three-marker haplotype, haplotype 2).

The significance of the values obtained for the haplotype association analysis was evaluated by the following
computer simulation test. The genotype data from the asthmatic and control individuals were pooled and randomly
allocated to two groups which contained the same number of individuals as the trait positive and trait negative groups
used to produce the data summarised in Table 7. A haplotype analysis was then run on these artificial groups for the
three haplotypes presented in Table 7. This experiment was reiterated 1000 times and the results are shown in Table 8.

Table 8

                                  Permutation Test
Haplotype            Chi-Square   Average Chi-Square   Maximal Chi-Square   P value
Haplotype 1 (AT)     15.70        1.2                  11.5                 1.0x10^-3
Haplotype 2 (ATG)    13.49        1.2                  10.5                 1.0x10^-3
Haplotype 3 (AATGT)  16.65        1.2                  9.3                  1.0x10^-3


The results in Table 8 show that among 1000 iterations only 1‰ of the obtained haplotypes has a p-value
comparable to the one obtained in Table 7.

These results clearly validate the statistical significance of the haplotypes obtained (haplotypes 1, 2 and 3,
Table 7).
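
The permutation procedure summarised in Table 8 can be sketched as follows. This is a simplified illustration, not the simulation actually run: it assumes each individual has a directly observed yes/no flag for carrying the haplotype, whereas the text re-runs the full haplotype analysis on every reshuffled grouping. All names and the example data are hypothetical.

```python
import random


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 degree of freedom) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def group_statistic(carriers, is_case):
    """Chi-square comparing carrier counts between cases and controls."""
    a = sum(1 for x, g in zip(carriers, is_case) if g and x)
    b = sum(1 for x, g in zip(carriers, is_case) if g and not x)
    c = sum(1 for x, g in zip(carriers, is_case) if not g and x)
    d = sum(1 for x, g in zip(carriers, is_case) if not g and not x)
    return chi_square_2x2(a, b, c, d)


def permutation_test(carriers, is_case, n_iter=1000, seed=0):
    """Shuffle case/control labels, keeping group sizes, and collect chi-squares."""
    rng = random.Random(seed)
    observed = group_statistic(carriers, is_case)
    labels = list(is_case)
    permuted = []
    for _ in range(n_iter):
        rng.shuffle(labels)  # pooled individuals reallocated at random
        permuted.append(group_statistic(carriers, labels))
    exceeding = sum(1 for s in permuted if s >= observed)
    return {
        "observed": observed,
        "average": sum(permuted) / n_iter,
        "maximum": max(permuted),
        "fraction_exceeding": exceeding / n_iter,
    }


if __name__ == "__main__":
    # Hypothetical data: 100 cases (40 carriers) and 100 controls (20 carriers)
    carriers = [True] * 40 + [False] * 60 + [True] * 20 + [False] * 80
    is_case = [True] * 100 + [False] * 100
    print(permutation_test(carriers, is_case))
```

The returned average and maximum correspond to the "Average Chi-Square" and "Maximal Chi-Square" columns of Table 8; an observed chi-square above the permuted maximum means none of the random allocations matched the real association.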

While Examples 15-31 illustrate the use of the maps and markers of the present invention for identifying a new
gene associated with a complex disease within a 2 Mb genomic region or for establishing that a candidate gene is, at least
partially, responsible for a disease, the maps and markers of the present invention may also be used to identify one or
more biallelic markers or one or more genes associated with other detectable phenotypes, including drug response, drug
toxicity, or drug efficacy. The biallelic markers used in such drug response analyses or shown, using the methods of the
present invention, to be associated with such traits, may lie within or near genes responsible for or partly responsible for
a particular disease, for example a disease against which the drug is meant to act, or may lie within genomic regions
which are not responsible for or partly responsible for a disease. For example, the genomic region harboring markers
associated with a particular drug response may carry a drug metabolism gene, or a gene encoding a protein with a role in
the drug response mechanism. Thus, biallelic markers within or near genes known to be involved in drug response,
toxicity, or efficacy or genes suspected of being involved in drug response, toxicity, or efficacy may be used to identify
individuals likely to respond positively or negatively to drug treatment. In the context of the present invention, a "positive
response" to a medicament can be defined as comprising a reduction of the symptoms related to the disease or condition
to be treated. In the context of the present invention, a "negative response" to a medicament can be defined as
comprising either a lack of positive response to the medicament which does not lead to a symptom reduction or to a
side-effect observed following administration of the medicament.

Drug efficacy, response and tolerance/toxicity can be considered as multifactorial traits involving a genetic
component in the same way as complex diseases such as Alzheimer's disease, prostate cancer, hypertension or diabetes.
As such, the identification of genes involved in drug efficacy and toxicity could be achieved following a positional cloning
approach, e.g. performing linkage analysis within families in order to obtain the subchromosomal location of the gene(s).
However, this type of analysis is actually impractical in the case of drug responsiveness, due to the lack of availability of
familial cases. In fact, the likelihood of having more than one individual in a particular family being exposed to the same
drug at the same time is very low. Therefore, drug efficacy and toxicity can only be analyzed as sporadic traits.

In order to conduct association studies to analyze the individual response to a given drug in groups of patients
affected with a disease, up to four groups are screened to determine their patterns of biallelic markers using the
techniques described above. The four groups are:

• Non-diseased or random controls,
• Diseased patients/drug responders,
• Diseased patients/drug non-responders,
• Diseased patients/drug side effects.

In preferred embodiments, the above mentioned groups are recruited according to phenotyping criteria having
the characteristics described above, so that the phenotypes defining the different groups are non-overlapping, preferably
extreme phenotypes.

In highly preferred embodiments, such phenotyping criteria have the bimodal distribution described above.
The final number and composition of the groups for each drug association study is adapted
to the distribution of the above described phenotypes within the studied population.

After selecting a suitable population, association and haplotype analyses may be performed as
described herein to identify one or more biallelic markers associated with drug response, preferably drug toxicity or drug
efficacy. The identification of such one or more biallelic markers allows one to conduct diagnostic tests to determine
whether the administration of a drug to an individual will result in drug response, preferably drug toxicity, or drug
efficacy.

The methods described above for identifying a gene associated with prostate cancer and biallelic markers
indicative of a risk of suffering from asthma may be utilized to identify genes associated with other detectable
phenotypes. In particular, the above methods may be used with any marker or combination of markers included in the
maps of the present invention, including the 653 biallelic markers obtained above (which include the sequences of SEQ
ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the PG1 markers, the asthma-associated markers,
and the Apo E markers of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. As described above,
the general strategy to perform the association studies using the maps and markers of the present invention is to scan
two groups of individuals (trait positive individuals and trait negative controls) characterized by a well defined phenotype
in order to measure the allele frequencies of the biallelic markers in each of these groups. Preferably, the frequencies of
markers with inter-marker spacing of about 150 kb are determined in each group. More preferably, the frequencies of
markers with inter-marker spacing of about 75 kb are determined in each group. Even more preferably, markers with
inter-marker spacing of about 50 kb, about 37.5 kb, about 30 kb, or about 25 kb will be tested in each population. For
genome-wide studies, it will be preferred to measure the frequencies of about 20,000, or about 40,000 biallelic markers
in each group. In a highly preferred embodiment, the frequencies of about 60,000, about 80,000, about 100,000, or
about 120,000 biallelic markers are determined in each group. In some embodiments, haplotype analyses may be run
using groups of markers located within regions spanning less than 1 kb, from 1 to 5 kb, from 5 to 10 kb, from 10 to 25 kb,
from 25 to 50 kb, from 50 to 150 kb, from 150 to 250 kb, from 250 to 500 kb, from 500 kb to 1 Mb, or more than 1 Mb.
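
The marker counts quoted above follow from dividing a genome length by the inter-marker spacing. The short sketch below assumes a genome of about 3,000 Mb; that figure and the function name are assumptions for illustration and are not stated in this passage.

```python
GENOME_SIZE_KB = 3_000_000  # assumption: a genome of roughly 3,000 Mb


def markers_needed(spacing_kb, genome_kb=GENOME_SIZE_KB):
    """Approximate number of evenly spaced markers for a given spacing in kb."""
    return round(genome_kb / spacing_kb)


if __name__ == "__main__":
    for spacing in (150, 75, 50, 37.5, 30, 25):
        print(f"{spacing} kb spacing -> about {markers_needed(spacing):,} markers")
```

Under that assumption, spacings of 150, 75, 50, 37.5, 30 and 25 kb correspond to about 20,000, 40,000, 60,000, 80,000, 100,000 and 120,000 markers respectively, which is the series of counts given in the text.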

Allele frequency can be measured using microsequencing techniques described herein; preferred high
throughput microsequencing procedures are further exemplified below; it will be further appreciated that any other large
scale genotyping method suitable with the intended purpose contemplated herein may also be used.

In some embodiments of the present invention a computer-based system may support the on-line coordination
between the identification of biallelic markers and the corresponding analysis of their frequency in the different groups.

It will be appreciated that it is not necessary to use a full high density biallelic marker map in order to start a
genome-wide association study; it is sufficient to generate and use a first set of about 20,000 markers (one marker per
BAC, average inter-marker spacing of about 150 kb). Maps having higher densities of biallelic markers (two or more
markers per BAC, average inter-marker spacing of about 75 kb or less) may then be generated by starting first on those
BACs for which a candidate association has been established at the first step.

In cases when one or more candidate regions have previously been delineated, such as cases where a particular
gene or genomic region is suspected of being associated with a trait, local excerpts of biallelic marker maps having
densities above one marker per 150 kb may be exploited using BACs harboring said genomic regions, or genes, or portions
thereof. In these cases also, successive association studies may be performed using sets of biallelic markers showing
increasing densities, preferably from about one every 150 kb to about one every 75 kb; more preferably, sets of markers
with inter-marker spacing below about 50 kb, below about 37.5 kb, below about 30 kb, most preferably below about 25
kb, will be used.

Haplotype analyses may also be conducted using groups of biallelic markers within the candidate region. The
biallelic markers included in each of these groups may be located within a genomic region spanning less than 1 kb, from 1
to 5 kb, from 5 to 10 kb, from 10 to 25 kb, from 25 to 50 kb, from 50 to 150 kb, from 150 to 250 kb, from 250 to 500 kb,
from 500 kb to 1 Mb, or more than 1 Mb. It will be appreciated that the ordered DNA fragments containing these groups of
biallelic markers need not completely cover the genomic regions of those lengths but may instead be incomplete contigs
having one or more gaps therein. As discussed in further detail below, biallelic markers may be used in association studies
and haplotype analyses regardless of the completeness of the corresponding physical contig harboring them, provided linkage
disequilibrium between the markers can be assessed.

As described above, if a positive association with a trait, such as a disease, or a drug efficacy and/or toxicity,
is identified using the biallelic markers and maps of the present invention, the maps will provide not only the
confirmation of the association, but also a shortcut towards the identification of the gene involved in the trait under
study. As described above, since the markers showing positive association to the trait are in linkage disequilibrium with
the trait loci, the causal gene will be physically located in the vicinity of these markers. Regions identified through
association studies using high density maps will on average have a 20-40 times shorter length than those identified by
linkage analysis (2 to 20 Mb).

As described above, once a positive association is confirmed with the high density biallelic marker maps of the
present invention, BACs from which the most highly associated markers were derived are completely sequenced and the
mutations in the causal gene are searched by applying genomic analysis tools. As described above, once a region
harboring a gene associated with a detectable trait has been sequenced and analyzed, the candidate functional regions
(e.g. exons and splice sites, promoters and other regulatory regions) are scanned for mutations by comparing the
sequences of a selected number of controls and cases, using adequate software.

In some embodiments, trait positive samples being compared to identify causal mutations are selected among
those carrying the ancestral haplotype; in these embodiments, control samples are chosen from individuals not carrying
said ancestral haplotype.

In further embodiments, trait positive samples being compared to identify causal mutations are selected among
those showing haplotypes that are as close as possible to the ancestral haplotype; in these embodiments, control
samples are chosen from individuals not carrying any of the haplotypes selected for the case population.

The mutation detection procedure is essentially similar to that used for biallelic site identification. A pair of
oligonucleotide primers is designed in order to amplify the sequences to be tested. In preferred embodiments, priority is
given to the testing of functional sequences; in such embodiments, sequences covering every exon/promoter predicted
region, preferably including potential splice sites, are determined and compared between the T+ and T- populations.
Amplification is carried out on DNA samples from T+ and T- individuals using the polymerase chain reaction under the
above described conditions. To be sequenced, amplification products from genomic PCR may be subjected to automated
dideoxy terminator sequencing reactions and electrophoresed on ABI 377 sequencers. Following gel image analysis and
DNA sequence extraction, ABI sequence data are automatically analyzed to detect the presence of sequence variations
among T+ and T- individuals. Sequences are preferably verified by comparing the sequences of both DNA strands of
each individual.

It is preferred that candidate polymorphisms be then verified by screening a larger population of cases and
controls by means of any genotyping procedure such as those described herein, preferably using a microsequencing
technique in an individual test format. Polymorphisms are considered as candidate mutations when present in cases and
controls at frequencies compatible with the expected association results.

The maps and biallelic markers of the present invention may also be used to identify patterns of biallelic
markers associated with detectable traits resulting from polygenic interactions. The analysis of genetic interaction
between alleles at unlinked loci requires individual genotyping using the techniques described herein. The analysis of
allelic interaction among a selected set of biallelic markers with appropriate p-values can be considered as a haplotype
analysis, similar to those described in further details within the present invention.



Use of Biallelic Markers to Identify Individuals Likely to Exhibit a Detectable
Trait Associated with a Particular Allele of a Known Gene

In addition to their utility in searches for genes associated with detectable traits on a genome-wide, chromosome-wide,
or subchromosomal level, the maps and biallelic markers of the present invention may be used in more targeted
approaches for identifying individuals likely to exhibit a particular detectable trait or individuals who exhibit a particular
detectable trait as a consequence of possessing a particular allele of a gene associated with the detectable trait. For
example, the biallelic markers and maps of the present invention may be used to identify individuals who carry an allele of a
known gene that is suspected of being associated with a particular detectable trait. In particular, the target genes may be
genes having alleles which predispose an individual to suffer from a specific disease state. In other cases, the target genes
may be genes having alleles that predispose an individual to exhibit a desired or undesired response to a drug or other
pharmaceutical composition, a food, or any administered compound. The known gene may encode any of a variety of types
of biomolecules. For example, the known genes targeted in such analyses may be genes known to be involved in a particular
step in a metabolic pathway in which disruptions may cause a detectable trait. Alternatively, the target genes may be genes
encoding receptors or ligands which bind to receptors in which disruptions may cause a detectable trait, genes encoding
transporters, genes encoding proteins with signaling activities, genes encoding proteins involved in the immune response,
genes encoding proteins involved in hematopoiesis, or genes encoding proteins involved in wound healing. It will be
appreciated that the target genes are not limited to those specifically enumerated above, but may be any gene known to
be or suspected of being associated with a detectable trait.

As previously mentioned, the maps and markers of the present invention may be used to identify genes
associated with drug response. Accordingly, the present invention comprises a method of using a drug comprising
obtaining a nucleic acid sample from an individual, determining the identity of the polymorphic base of one or more
biallelic markers obtained by the methods described above which is or are associated with a positive response to
treatment with the drug or one or more biallelic markers obtained by the methods described above which is or are
associated with a negative response to treatment with the drug, and administering the drug to the individual if the
nucleic acid sample contains one or more alleles of biallelic markers associated with a positive response to treatment
with the drug or if said nucleic acid sample lacks one or more alleles of biallelic markers associated with a negative
response to the drug. In some embodiments of the method, the administering step comprises administering the drug to
the individual if the nucleic acid sample contains one or more alleles of biallelic markers associated with a positive
response to treatment with the drug and the nucleic acid sample lacks one or more alleles of biallelic markers associated
with a negative response to the drug.
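
The two decision rules in the preceding paragraph (the "or" form and the stricter "and" form) can be written as a small predicate. This is only a sketch: the allele labels, the function name, and the reading of "lacks one or more alleles" as "carries none of the listed negative-response alleles" are assumptions made for illustration, and the text itself could be read less strictly.

```python
def should_administer(sample_alleles, positive_alleles, negative_alleles,
                      require_both=False):
    """Apply the administration rule to the set of alleles found in a sample.

    sample_alleles, positive_alleles and negative_alleles are sets of
    hypothetical labels such as "35/390:T" (marker name and base).
    """
    has_positive = bool(sample_alleles & positive_alleles)
    # Assumption: "lacks" is read as carrying none of the negative alleles
    lacks_negative = not (sample_alleles & negative_alleles)
    if require_both:
        return has_positive and lacks_negative  # stricter embodiment
    return has_positive or lacks_negative       # general method


if __name__ == "__main__":
    positive = {"marker1:T"}   # hypothetical positive-response allele
    negative = {"marker2:A"}   # hypothetical negative-response allele
    sample = {"marker1:T", "marker2:A"}
    print(should_administer(sample, positive, negative))
    print(should_administer(sample, positive, negative, require_both=True))
```

A sample carrying both a positive-response and a negative-response allele passes the "or" rule but fails the "and" rule, which is the practical difference between the two embodiments. The same predicate would serve for the clinical trial inclusion step, which uses the same two conditions.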

The biallelic markers of the present invention may also be used to select individuals for inclusion in
the clinical trials of a drug. By selecting individuals who are likely to respond favorably to a drug for inclusion in the
trial, the effectiveness of the drug can be assessed without lowering the measured effectiveness as a result of including
non-responders or negative responders in the clinical trial. Maybe more importantly, using such selection may avoid
including patients who may suffer from undesirable side effects if administered the drug under trial, thus increasing the
safety of clinical trials. Accordingly, the present invention also includes a method of selecting an individual for inclusion
in a clinical trial of a drug comprising obtaining a nucleic acid sample from an individual, determining the identity of the
polymorphic base of one or more biallelic markers obtained by the methods described above which is or are associated
with a positive response to treatment with the drug or one or more biallelic markers associated with a negative response
to treatment with the drug in the nucleic acid sample, and including the individual in the clinical trial if the nucleic acid
sample contains one or more alleles of biallelic markers obtained by the methods described above which is or are
associated with a positive response to treatment with said drug or if the nucleic acid sample lacks one or more alleles of
biallelic markers associated with a negative response to the drug. In one embodiment of the method, the inclusion step
comprises including the individual in the clinical trial if the nucleic acid sample contains one or more alleles of biallelic
markers associated with a positive response to treatment with the drug and the nucleic acid sample lacks one or more
alleles of biallelic markers associated with a negative response to the drug.

In particular embodiments, one or several of the ApoE linked markers of SEQ ID Nos. 301-305/307-311 or the
sequences complementary thereto may be used in targeted approaches to identify individuals who are likely to develop
Alzheimer's disease, or to identify individuals who do suffer from Alzheimer's disease. In other embodiments, one or more of
the markers of SEQ ID Nos. 306 and 312 and one or more of the ApoE linked markers of SEQ ID Nos. 301-305/307-311
or the sequences complementary thereto are genotyped in targeted approaches to identify individuals who are likely to develop
Alzheimer's disease, or to identify individuals who do suffer from Alzheimer's disease. In further embodiments, one or several
of the PG1 linked markers may be tested in targeted approaches to identify individuals who are likely to develop prostate
cancer, or to identify individuals who do suffer from prostate cancer. Finally, individuals likely to be asthmatic, or asthmatic
individuals, can be identified using one or more of the asthma-associated markers to conduct the procedures of the present
invention.

Given the high number of cancer types in which the PG1 chromosomal region is involved, it will be appreciated that
the PG1 markers may be employed to identify individuals at risk of developing cancers other than prostate cancer, or to
identify individuals suffering from cancers other than prostate cancer. It will be further appreciated that the
asthma-associated markers may be tested to identify individuals likely to exhibit, or exhibiting, inflammatory traits other
than the asthmatic state (e.g. arthritis, or psoriasis, among others). The present invention provides adequate methods to
establish associations between markers, such as those mentioned above, and candidate traits expressly contemplated herein,
thus legitimating the corresponding targeted approaches to identify individuals likely to exhibit, or exhibiting, said
candidate traits.

In some embodiments, the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos.
1-50 and 51-100 or the sequences complementary thereto) may be used in targeted approaches to identify individuals at
risk of developing a detectable trait, for example a complex disease or desired/undesired drug response, or to identify
individuals exhibiting said trait. The present invention provides methods to establish putative associations between any of
the biallelic markers described herein and any detectable traits, including those specifically described herein.

To use the maps and markers of the present invention in further targeted approaches, biallelic markers which are
in linkage disequilibrium with any of the above disclosed markers may be identified. In cases where one or more biallelic
markers of the present invention have been shown to be associated with a detectable trait, more biallelic markers in linkage
disequilibrium with said associated biallelic markers may be generated and used to perform targeted approaches aiming at
identifying individuals exhibiting, or likely to exhibit, said detectable trait, according to the methods provided herein.

Furthermore, in cases where a candidate gene is suspected of being associated with a particular detectable trait or
suspected of causing the detectable trait, biallelic markers in linkage disequilibrium with said candidate gene may be
identified and used in targeted approaches, such as the approaches utilized above for the asthma-associated gene and the
Apo E gene.

Biallelic markers that are in linkage disequilibrium with markers associated with a detectable trait, or with genes
associated with a detectable trait, or suspected of being so, are identified by performing single marker analyses, haplotype
association analyses, or linkage disequilibrium measurements on samples from trait positive and trait negative individuals as
described above using biallelic markers lying in the vicinity of the target marker or gene. In this manner, a single biallelic
marker or a group of biallelic markers may be identified which indicate that an individual is likely to possess the detectable
trait or does possess the detectable trait as a consequence of a particular allele of the target marker or gene.
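
The text does not say which linkage disequilibrium measure is used. As an illustration only, the sketch below computes three standard pairwise measures (D, |D'| and r^2) for two biallelic markers from a haplotype frequency and the two allele frequencies; the function and parameter names are hypothetical.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for alleles A and B at two biallelic markers.

    p_ab is the frequency of the haplotype carrying both A and B;
    p_a and p_b are the frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared


if __name__ == "__main__":
    # Hypothetical frequencies: allele A and allele B always travel together
    print(linkage_disequilibrium(p_ab=0.5, p_a=0.5, p_b=0.5))
    # Hypothetical frequencies: the two markers are independent
    print(linkage_disequilibrium(p_ab=0.25, p_a=0.5, p_b=0.5))
```

Values of |D'| and r^2 near 1 indicate markers that would carry largely redundant information about a nearby trait allele, while values near 0 indicate independence. The haplotype frequency passed in would itself typically come from an estimation step when phase is not observed directly.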

Nucleic acid samples from individuals to be tested for predisposition to a detectable trait or possession of a
detectable trait as a consequence of a particular allele of the target gene may be examined using the diagnostic methods
described below.

Diagnostic Methods

To use the maps and biallelic markers of the present invention to diagnose whether an individual is predisposed to
express a detectable trait or whether the individual expresses a detectable trait as a result of a particular mutation, one or
more biallelic markers indicative of such a predisposition or causative mutation are identified by performing association
studies and haplotype analysis on affected and non-affected individuals as described above.

The diagnostic techniques of the present invention may employ a variety of methodologies to determine
whether a test subject has a biallelic marker pattern associated with an increased risk of developing a detectable trait or
whether the individual suffers from a detectable trait as a result of a particular mutation, including methods which
enable the analysis of individual chromosomes for haplotyping, such as family studies, single sperm DNA analysis or
somatic hybrids.

The trait analyzed using the present diagnostics may be any detectable trait, including diseases, drug response,
drug efficacy, or drug toxicity. A "positive" drug response may refer to a response indicating either some drug efficacy
or no drug toxicity. Diagnostics which analyze drug response, drug efficacy, or drug toxicity may be used to determine
whether an individual should be treated with a particular drug. For example, if the diagnostic indicates a likelihood that
an individual will respond positively to treatment with a particular drug, the drug may be administered to the individual.
Conversely, if the diagnostic indicates that an individual is likely to respond negatively to treatment with a particular
drug, an alternative course of treatment may be prescribed. A negative response may be defined as either the absence
of an efficacious response or the presence of toxic side effects.

Clinical drug trials represent another application for the maps and markers of the present invention. One or
more markers indicative of drug response, drug efficacy, or drug toxicity may be identified using the techniques
described above. Thereafter, potential participants in clinical trials of the drug may be screened to identify those
individuals most likely to respond favorably to the drug and exclude those likely to experience side effects. In that way,
the effectiveness of drug treatment may be measured in individuals who respond positively to the drug, without lowering
the measurement as a result of the inclusion of individuals who are unlikely to respond positively in the study and
without risking undesirable safety problems.
In each of the diagnostic methods, a nucleic acid sample is obtained from the test subject and the biallelic
marker pattern is determined for one or more of the biallelic markers included in the maps of the present invention,
including the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the
sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the Apo E
biallelic markers, including those of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. In other
embodiments, the biallelic marker pattern of one or more of the markers of SEQ ID Nos. 306 and 312 is determined in
addition to determining the biallelic marker pattern of one or more of the biallelic markers included in the maps of the
present invention, including the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50
and 51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic
markers, and the Apo E biallelic markers, including those of SEQ ID Nos. 301-305/307-311 or the sequences
complementary thereto. In some embodiments, the biallelic marker pattern is determined by conducting an amplification
reaction to generate amplicons containing the polymorphic bases of the one or more biallelic markers to be genotyped.
The identities of the polymorphic bases of the one or more biallelic markers to be analyzed may be determined using a
variety of methods, including hybridization assays which specifically detect amplification products containing particular
alleles of the one or more biallelic markers, and microsequencing reactions which identify the polymorphic bases of the
one or more biallelic markers to be analyzed.

While the following discussion utilizes the 653 biallelic markers obtained above (which include the sequences
of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the
PG1 biallelic markers, and the Apo E biallelic markers as examples of the diagnostics of the present invention, it will be
appreciated that the same diagnostics may be used in conjunction with any marker or any group of markers included in
the maps of the present invention.

Examples of amplification primers enabling the amplification, from subjects' genomic DNA samples, of DNA
fragments that carry each of the markers of SEQ ID Nos: 1-50 and 51-100 or the sequences complementary thereto, are
oligonucleotides of SEQ ID Nos: 101-150 and 151-200; pairs of corresponding primers for a given biallelic marker may
be reconstituted by choosing the adequate upstream oligonucleotide from SEQ ID Nos: 101-150 together with the
corresponding downstream oligonucleotide from SEQ ID Nos: 151-200.

SEQ ID Nos: 1-50 correspond to the sequence identification number for a first allele of the biallelic markers of
SEQ ID Nos: 1-50 and 51-100 and SEQ ID Nos: 51-100 correspond to the sequence identification number for a second
allele of the biallelic markers of SEQ ID Nos: 1-50 and 51-100.

SEQ ID Nos: 313-318 correspond to sequence identification numbers of upstream amplification primers
that may be used to generate amplification products containing the polymorphic bases of the biallelic markers of
respective SEQ ID Nos: 301-306/307-312. SEQ ID Nos: 319-324 correspond to downstream amplification primers that
may be used to generate amplification products containing the polymorphic bases of the biallelic markers of respective
SEQ ID Nos: 301-306/307-312.

For all markers of SEQ ID Nos: 1-50/51-100 and 301-306/307-312 or the sequences complementary thereto,
the enclosed listings indicate the position and identity of the polymorphic base in each biallelic marker. Potential
microsequencing primers are also included in the sequence listing. The sequences of SEQ ID Nos. 201-250 may be used
in microsequencing procedures such as those described herein to determine the sequence of the polymorphic bases of the
biallelic markers of SEQ ID Nos. 1-50/51-100. The sequences of SEQ ID Nos. 325-330 or 331-336 may be used in
microsequencing procedures such as those described herein to determine the sequence of the polymorphic bases of the
biallelic markers of SEQ ID Nos. 301-306/307-312.

All listings indicate the internal identification number corresponding to the biallelic marker to which the listed sequence
is related.

One aspect of the present invention is a method for determining whether an individual is at risk of developing
Alzheimer's Disease or whether an individual suffers from Alzheimer's Disease as a consequence of possessing the Apo E
ε4 site A allele. The method involves obtaining a nucleic acid sample from the individual and determining whether the
nucleic acid sample contains one or more markers indicative of a risk of developing Alzheimer's Disease or one or more
markers indicative that the individual suffers from Alzheimer's Disease as a result of possessing the Apo E ε4 site A
allele. In one embodiment, the method comprises determining the identity of the polymorphic base of one or more
biallelic markers selected from the group consisting of SEQ ID Nos. 301-305/307-312 or the sequences complementary
thereto in the nucleic acid sample. In a further embodiment, the method involves determining whether the nucleic acid
sample contains the sequence of SEQ ID No. 306 (the C allele of marker 99-2452/54 containing the Apo E ε4 site A
allele) or the sequence complementary thereto. In a further embodiment, the method comprises determining whether the
nucleic acid sample contains SEQ ID No. 311 (the T allele of marker 99-365/344) or the sequence complementary
thereto. In another embodiment, the method comprises determining whether the nucleic acid sample contains SEQ ID
No. 311 (the T allele of marker 99-365/344) and SEQ ID No. 306 (the C allele of marker 99-2452/54 containing the Apo
E ε4 site A allele) or the sequence complementary thereto.

In still a further embodiment, the method comprises determining whether the nucleic acid sample contains SEQ
ID Nos. 202, 201, 303, and 304 or the sequences complementary thereto. In still a further embodiment, the method
comprises determining whether the nucleic acid sample contains SEQ ID Nos. 302, 303, and 304 or the sequences
complementary thereto. In a further embodiment, the method comprises determining whether the nucleic acid sample
contains SEQ ID No. 311 (the T allele of marker 99-365/344) or the sequence complementary thereto.

In some embodiments, the step of determining the identity of the polymorphic base of one or more biallelic
markers selected from the group consisting of SEQ ID Nos. 301-305 and SEQ ID Nos. 307-311 or the sequences
complementary thereto in the nucleic acid sample comprises conducting an amplification reaction on said nucleic acid
sample using one or more of the amplification primers selected from the group consisting of SEQ ID Nos. 313-317 and
SEQ ID Nos. 319-323 and determining the identity of the polymorphic base in said one or more biallelic markers.

In some embodiments, the identity of the polymorphic base may be determined using one or more of the
microsequencing primers listed as SEQ ID Nos. 325-329 or 331-335. In embodiments comprising the step of
determining whether the nucleic acid sample contains the sequence of SEQ ID No. 306, the method may comprise
conducting an amplification reaction on the nucleic acid sample using the pair of amplification primers consisting of SEQ
ID Nos. 318 and 324. In some embodiments, the step of determining whether the nucleic acid sample contains the
sequence of SEQ ID No. 306 comprises conducting a microsequencing reaction using one of the microsequencing primers
listed as SEQ ID Nos. 330 or 335.

Another aspect of the present invention relates to a method of determining whether an individual is at risk of
developing a trait or whether an individual expresses a trait as a consequence of possessing a particular trait-causing
allele. Alternatively, another aspect of the present invention relates to a method of determining whether an individual is
at risk of developing a plurality of traits or whether an individual expresses a plurality of traits as a result of possessing
particular trait-causing alleles. These methods involve obtaining a nucleic acid sample from the individual and
determining whether the nucleic acid sample contains one or more markers indicative of a risk of developing the trait or
one or more markers indicative that the individual expresses the trait as a result of possessing a particular trait-causing
allele. In one embodiment, the methods comprise determining the identity of the polymorphic base of one or more
biallelic markers in the maps of the present invention, including any of the 653 biallelic markers obtained above (which
include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated
biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers. In a further embodiment, the methods
comprise determining the identities of the polymorphic bases of at least two, at least three, at least five, at least eight,
at least 20, at least 100, at least 200, at least 300, at least 400, between 400 and 2,000, between 2,000 and 4,000,
between 4,000 and 10,000, between 10,000 and 20,000 or more than 20,000 of the biallelic markers in the maps of
the present invention, including any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID
Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1
biallelic markers, and the new Apo E biallelic markers.

In some embodiments, the step of determining the identity of the polymorphic base of one or more biallelic
markers in the maps of the present invention, including any of the 653 biallelic markers obtained above (which include
the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated
biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers, comprises conducting an amplification
reaction on said nucleic acid sample using appropriate amplification primers and determining the identity of the
polymorphic base in said one or more biallelic markers. In some embodiments, the identity of the polymorphic base may
be determined using appropriate microsequencing primers.

As described herein, the diagnostics may be based on a single biallelic marker or a group of biallelic markers. 
Without wishing to be limited to any particular value, it is preferred that the biallelic marker used in single marker
diagnostics, either as a positive basis for further diagnostic tests or as a preliminary starting point for early preventive
therapy, exhibit a p value in preliminary screening association analyses of about 1x10'^ or less. More preferably the p
value is about 1x10'* or less.

Similarly, without wishing to be limited to any particular value for diagnostics based on more than one biallelic
marker, it is preferred that the haplotype exhibit a p value of 1x10'^ or less, still more preferably 1x10'^ or less and
most preferably of about 1x10'^ or less in a preliminary screening haplotype analysis. These values are believed to be
applicable to any association studies involving single or multiple marker combinations. Significance thresholds may be
refined according to the methods previously described.

Example 32 describes methods for determining the biallelic marker pattern in a nucleic acid sample.

Example 32

A nucleic acid sample is obtained from an individual to be tested for susceptibility to a detectable trait or for a
detectable trait caused by a particular mutation. The nucleic acid sample may be an RNA sample or a DNA sample.

A PCR amplification is conducted using primer pairs which generate amplification products containing the
polymorphic nucleotides of one or more biallelic markers associated with such a predisposition or causative mutation.
For example, the amplification products may contain the polymorphic bases of one or more of the biallelic markers in the
maps of the present invention, including any of the 653 biallelic markers obtained above (which include the sequences of
SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the
PG1 biallelic markers, and the Apo E biallelic markers or biallelic markers in linkage disequilibrium with any of these
biallelic markers. In some embodiments, the PCR amplification is conducted using primer pairs which generate
amplification products containing the polymorphic nucleotides of several biallelic markers. For example, in one
embodiment, amplification products containing the polymorphic bases of one or more biallelic markers in the maps of the
present invention, including any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID
Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1
biallelic markers, and the Apo E biallelic markers, biallelic markers which are in linkage disequilibrium therewith or with a
causative mutation associated with a detectable phenotype may be generated. In another embodiment, amplification
products containing the polymorphic bases of five or more biallelic markers in the maps of the present invention,
including any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and
51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and
the Apo E biallelic markers, biallelic markers which are in linkage disequilibrium therewith or with a causative mutation
associated with a detectable phenotype may be generated. In another embodiment, amplification products containing the
polymorphic bases of 20 or more biallelic markers in the maps of the present invention, including any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary
thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the Apo E biallelic markers, biallelic
markers which are in linkage disequilibrium therewith or with the causative mutation may be generated. In another
embodiment, amplification products containing the polymorphic bases of 100 or more biallelic markers in the maps of the
present invention, including any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID
Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1
biallelic markers, and the Apo E biallelic markers, biallelic markers which are in linkage disequilibrium therewith or with a
causative mutation associated with a detectable phenotype may be generated. In another embodiment, amplification
products containing the polymorphic bases of 200 or more biallelic markers in the maps of the present invention,
including any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and
51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and
the Apo E biallelic markers, biallelic markers which are in linkage disequilibrium therewith or with a causative mutation
associated with a detectable phenotype may be generated. In another embodiment, amplification products containing the
polymorphic bases of 300 or more biallelic markers in the maps of the present invention, including any of the 653
biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences
complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the Apo E biallelic
markers, biallelic markers which are in linkage disequilibrium therewith or with the causative mutation may be
generated. In another embodiment, amplification products containing the polymorphic bases of 400 or more biallelic
markers in the maps of the present invention, including any of the 653 biallelic markers obtained above (which
include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated
biallelic markers, the PG1 biallelic markers, and the Apo E biallelic markers, biallelic markers which are in linkage
disequilibrium therewith or with a causative mutation associated with a detectable phenotype may be generated.

The primers used to generate the amplification products may be designed as described herein. Representative
amplification primers for generating amplification products containing the polymorphic bases of the biallelic markers of
SEQ ID Nos. 1-50 and 51-100 are provided as SEQ ID Nos. 101-150/151-200 in the accompanying Sequence Listing.
The PCR primers may be oligonucleotides of 10, 15, 20 or more bases in length which enable the amplification of the
polymorphic site in the markers. In some embodiments, the amplification product produced using these primers may be
at least 100 bases in length (i.e. about 50 nucleotides on each side of the polymorphic base). In other embodiments, the
amplification product produced using these primers may be at least 500 bases in length (i.e. about 250 nucleotides on
each side of the polymorphic base). In still further embodiments, the amplification product produced using these primers
may be at least 1000 bases in length (i.e. about 500 nucleotides on each side of the polymorphic base).
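The relationship between flanking length and amplification product size can be illustrated with a short calculation. This is a minimal sketch, assuming a 0-based index for the polymorphic base within a reference sequence of known length; the function name and the clipping behavior at the sequence ends are illustrative choices, not part of the specification.

```python
def amplicon_window(snp_index: int, flank: int, seq_len: int) -> tuple[int, int]:
    """Return the half-open interval [start, end) of an amplicon centered on a
    polymorphic base, keeping `flank` nucleotides on each side where possible.

    The window is clipped at the ends of the reference sequence.
    """
    if not 0 <= snp_index < seq_len:
        raise ValueError("snp_index must lie inside the sequence")
    if flank < 0:
        raise ValueError("flank must be non-negative")
    start = max(0, snp_index - flank)
    end = min(seq_len, snp_index + flank + 1)
    return start, end


if __name__ == "__main__":
    # About 50, 250 and 500 nucleotides on each side of the polymorphic base.
    for flank in (50, 250, 500):
        start, end = amplicon_window(snp_index=1000, flank=flank, seq_len=5000)
        print(flank, end - start)
```

With 50 nucleotides on each side the product spans 101 positions, consistent with "at least 100 bases in length".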

Table 9 lists the internal identification numbers of the 50 localized markers described herein and the Apo E
markers described herein, the SEQ ID Nos. for each of the two alleles of these biallelic markers, the SEQ ID Nos. of
representative upstream and downstream amplification primers which can be used to generate amplification products
including the polymorphic bases of these biallelic markers, and the SEQ ID Nos. of microsequencing primers which can be
used to determine the identities of the polymorphic bases of these markers.

Table 10

Marker          SEQ ID Nos          SEQ ID Nos                SEQ ID Nos
(Genset code)   First    Second     Amplification primers     Microsequencing primers
                allele   allele     Upstream    Downstream    1        2

99-2647         49       99         149         199           249      299
99-2649         50       100        150         200           250      300

It will be appreciated that the primers listed in Table 9 are merely exemplary and that any other set of primers
which produce amplification products containing the polymorphic nucleotides of one or more of the biallelic markers of
SEQ ID Nos: 1-50 and 51-100 or biallelic markers in linkage disequilibrium therewith or with a causative mutation for a
detectable trait or a combination thereof may be used in the diagnostic methods. It will also be appreciated that these
diagnostic methods may be performed with any biallelic marker or combination of biallelic markers included in the maps
of the present invention.

Following the PCR amplification, the identities of the polymorphic bases of one or more of the biallelic markers
in the nucleic acid sample are determined. The identities of the polymorphic bases may be determined using the
microsequencing procedures described in Example 13. It will be appreciated that the microsequencing primers listed as
SEQ ID NOs: 201-250 and 251-300 are merely exemplary and that any primer having a 3' end near the polymorphic
nucleotide, and preferably immediately adjacent to the polymorphic nucleotide, may be used. Similarly, it will be
appreciated that microsequencing analysis may be performed for any marker or combination of markers in the maps of
the present invention.

Alternatively, the microsequencing analysis may be performed as described in Pastinen et al., Genome
Research 7:606-614 (1997), the disclosure of which is incorporated herein by reference, and which is described in more
detail below.

Alternatively, the PCR product may be completely sequenced to determine the identities of the polymorphic
bases in the biallelic markers. In another method, the identities of the polymorphic bases in the biallelic markers are
determined by hybridizing the amplification products to microarrays containing allele specific oligonucleotides specific
for the polymorphic bases in the biallelic markers. The use of microarrays comprising allele specific oligonucleotides is
described in more detail below.

It will be appreciated that the identities of the polymorphic bases in the biallelic markers may be determined
using techniques other than those listed above, such as conventional dot blot analyses.

Nucleic acids used in the above diagnostic procedures may comprise at least 10 consecutive nucleotides,
including the polymorphic bases, of the biallelic markers in the maps of the present invention, including any of the 653
biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences
complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic
markers, including those of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. Alternatively, the
nucleic acids used in the above diagnostic procedures may comprise at least 15 consecutive nucleotides, including the
polymorphic bases, of the biallelic markers in the maps of the present invention, including any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary
thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers,
including those of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. In some embodiments, the
nucleic acids used in the above diagnostic procedures may comprise at least 20 consecutive nucleotides, including the
polymorphic bases, of the biallelic markers in the maps of the present invention, including any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary
thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers,
including those of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. In still other embodiments,
the nucleic acids used in the above diagnostic procedures may comprise at least 30 consecutive nucleotides, including
the polymorphic bases, of the biallelic markers in the maps of the present invention, including any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary
thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers,
including those of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. In further embodiments, the
nucleic acids used in the above diagnostic procedures may comprise more than 30 consecutive nucleotides, including the
polymorphic bases, of the biallelic markers in the maps of the present invention, including any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary
thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers,
including those of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto. In still further embodiments,
the nucleic acids used in the above diagnostic procedures may comprise the entire sequence of the biallelic markers in
the maps of the present invention, including any of the 653 biallelic markers obtained above (which include the
sequences of SEQ ID Nos. 1-50 and 51-100 or the sequences complementary thereto), the asthma-associated biallelic
markers, the PG1 biallelic markers, and the new Apo E biallelic markers, including those of SEQ ID Nos. 301-305/307-311
or the sequences complementary thereto. In some embodiments, the nucleic acids used in the diagnostic procedures
are longer than the sequences of SEQ ID Nos. 1-50, 51-100, 301-305 and 307-311 because they contain nucleotides
adjacent to these sequences.
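The requirement that a fragment comprise a given number of consecutive nucleotides "including the polymorphic bases" can be made concrete by enumerating every window of that length that covers the polymorphic position. The following is a minimal sketch under stated assumptions: the marker is given as a plain string, the polymorphic base is identified by a 0-based index, and the function name is hypothetical.

```python
def fragments_spanning(seq: str, snp_index: int, length: int) -> list[str]:
    """List every contiguous fragment of `length` nucleotides of `seq`
    that includes the polymorphic base at 0-based position `snp_index`."""
    if not 0 <= snp_index < len(seq):
        raise ValueError("snp_index must lie inside the sequence")
    if length < 1 or length > len(seq):
        return []
    # Earliest start that still reaches the polymorphic base,
    # and latest start that both covers it and fits in the sequence.
    first_start = max(0, snp_index - length + 1)
    last_start = min(snp_index, len(seq) - length)
    return [seq[s:s + length] for s in range(first_start, last_start + 1)]


if __name__ == "__main__":
    # Toy 10-nucleotide sequence with the polymorphic base at index 4.
    for fragment in fragments_spanning("ACGTACGTAC", snp_index=4, length=3):
        print(fragment)
```

For a marker of realistic length, the same call with `length` set to 10, 15, 20 or 30 yields the fragment families recited above.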

The diagnostics of the present invention may also employ nucleic acid arrays attached to DNA chips or any
other suitable solid support, including beads. As used herein, the term array means a one dimensional, two dimensional or
multidimensional arrangement of a plurality of nucleic acids of sufficient length to permit specific detection of nucleic acids
capable of hybridizing thereto.

DNA chips allow the integration of micro-biochemical processes (such as DNA hybridization), systems of signal
detection (such as fluorescence) and data processing into a single system which can be used to obtain information on
polymorphism. The solid surface of the chip is often made of silicon or glass but it can be a polymeric membrane.
Efficient access to polymorphism information is obtained through a basic structure comprising high-density arrays of
oligonucleotide probes attached to a solid support (the chip) at selected positions. The immobilization of arrays of DNA
probes on solid supports has been rendered possible by the development of a technology generally identified as "Very
Large Scale Immobilized Polymer Synthesis" (VLSIPS™) and in which, typically, probes are immobilized in a high density
array on a solid surface of a chip. Examples of VLSIPS™ technologies are provided in US Patents 5,143,854 and
5,412,087 and in PCT Publications WO 90/15070, WO 92/10092 and WO 95/11995, the disclosures of which are
incorporated herein by reference, which describe methods for forming oligonucleotide arrays through techniques such as
light-directed synthesis techniques.

In designing strategies aimed at providing arrays of nucleotides immobilized on solid supports, further
presentation strategies were developed to order and display the probe arrays on the chips in an attempt to maximize
hybridization patterns and sequence information. Examples of such presentation strategies are disclosed in PCT
Publications WO 94/12305, WO 94/11530, WO 97/29212 and WO 97/31256, the disclosures of which are incorporated
herein by reference.

Each DNA chip can contain thousands to millions of individual synthetic DNA probes arranged in a grid-like
pattern and miniaturized to the size of a dime.

The chip technology has been successfully used to detect mutations in numerous cases. For example, the
screening of mutations has been undertaken in the BRCA1 gene, in S. cerevisiae mutant strains, and in the protease
gene of HIV-1 virus (see Hacia et al., Nat. Genet. 14:441-447 (1996); Shoemaker et al., Nat. Genet. 14:450-456 (1996);
Kozal et al., Nat. Med. 2:753-759 (1996), the disclosures of which are incorporated herein by reference). At least three
companies propose chips able to detect biallelic polymorphisms: Affymetrix (GeneChip), Hyseq (HyChip and HyGnostics),
and Protogene Laboratories.

In some embodiments, the efficiency of hybridization of nucleic acids in the sample with the probes attached to
the chip may be improved by using polyacrylamide gel pads isolated from one another by hydrophobic regions in which
the DNA probes are covalently linked to an acrylamide matrix.

The polymorphic bases present in the biallelic marker or markers of the sample nucleic acids are determined as
follows. Probes which contain at least a portion of one or more of the biallelic markers of the present invention are
synthesized either in situ or by conventional synthesis and immobilized on an appropriate chip using methods known to
the skilled technician.

The nucleic acid sample which includes the candidate region to be analyzed is isolated, amplified with primers
capable of generating an amplification product containing the polymorphic bases of one or more biallelic markers, and
labeled with a reporter group. The reporter group can be a fluorescent group such as phycoerythrin. The labeled nucleic
acid is then incubated with the probes immobilized on the chip using a fluidics station. For example, Manz et al. (Adv. in
Chromatogr. 33:1-66 (1993), the disclosure of which is incorporated herein by reference) describe the fabrication of
fluidics devices, and particularly microcapillary devices, in silicon and glass substrates.

After the reaction is completed, the chip is inserted into a scanner and patterns of hybridization are detected. 
The hybridization data is collected as a signal emitted from the reporter groups already incorporated into the nucleic
acids generated in the amplification of the sample DNA, which is now bound to the probes attached to the chip. Probes
that perfectly match a sequence of the nucleic acid sample generally produce stronger signals than those that have
mismatches. Since the sequence and position of each probe immobilized on the chip is known, the identity of the nucleic
acid hybridized to a given probe can be determined.

For single-nucleotide polymorphism analyses, sets of four oligonucleotides are generally designed (one for each
possible base) that span each position of a portion of the candidate region found in the nucleic acid sample, differing only
in the identity of the central base. The relative intensity of hybridization to each series of probes at a particular location
allows the identification of the base corresponding to the central base of the probe. For example, to detect single
nucleotide polymorphisms such as those in the present biallelic markers, oligonucleotides having each of the two allelic
bases at their central position are affixed to the chip. The amplification products resulting from amplification of the
nucleic acids in the sample are hybridized to the chip under high stringency (at lower salt concentration and higher
temperature over shorter time periods) to facilitate specific detection of the polymorphic sequences present in the
nucleic acid sample.
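The probe-set design and intensity-based base call described above can be sketched in a few lines. This is an illustrative sketch only: probes are written in the same orientation as the reference region for readability (probes on a real chip would be chosen to hybridize to the labeled strand), the intensities are made-up values, and the function names are hypothetical.

```python
BASES = "ACGT"


def probe_set(region: str, center: int, flank: int) -> dict[str, str]:
    """Build four probes spanning `center` (0-based) that differ only in
    the identity of the central base, with `flank` bases on each side."""
    if center - flank < 0 or center + flank >= len(region):
        raise ValueError("probe would extend past the region")
    left = region[center - flank:center]
    right = region[center + 1:center + flank + 1]
    return {base: left + base + right for base in BASES}


def call_base(intensities: dict[str, float]) -> str:
    """Call the central base as the probe with the strongest signal, on the
    premise that a perfect match hybridizes better than a mismatch."""
    return max(intensities, key=intensities.get)


if __name__ == "__main__":
    probes = probe_set("AACGTTGCA", center=4, flank=2)
    print(probes)
    # Hypothetical scanner readout for the four probes at this position.
    print(call_base({"A": 0.10, "C": 0.05, "G": 0.20, "T": 0.90}))
```

For a biallelic marker, only the two probes carrying the allelic bases need to be affixed, and a heterozygous sample would be expected to give comparable signals on both; that case is not handled by this simple maximum rule.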

The use of direct electric field control improves the determination of single base mutations (Nanogen). A
positive field increases the transport rate of negatively charged nucleic acids and results in a 10-fold increase of the
hybridization rates. Using this technique, single base pair mismatches are detected in less than 15 sec (see Sosnowski et
al., Proc. Natl. Acad. Sci. USA 94:1119-1123 (1997), the disclosure of which is incorporated herein by reference).

Another technique which can be used to analyze polymorphisms includes multicomponent integrated systems
which miniaturize and compartmentalize processes such as restriction enzyme digestion, PCR reactions, and capillary
electrophoresis in a single functional device. An example of such technique is disclosed in US patent 5,589,136, the
disclosure of which is incorporated herein by reference, which concerns the integration of PCR amplification and
capillary electrophoresis in chips. Integrated systems are best applied with microfluidic systems. These systems
comprise a pattern of microchannels designed onto a glass, silicon, quartz, or plastic wafer included on a microchip. The
movements of the samples are controlled by electric forces applied across different areas of the microchip to create
functional microscopic valves and pumps with no moving parts. Regulating or varying the voltage controls the liquid flow
at intersections between the micro-machined channels and changes the liquid flow rate for pumping across different
sections of the microchip.

In the case of biallelic marker analyses, the micro-chip integrates nucleic acid amplification, a microsequencing
reaction (such as the one described above), capillary electrophoresis and a detection method such as laser-induced
fluorescence detection.

In a first step, the DNA samples are amplified, preferably by PCR. Then, the amplification products are
subjected to automated microsequencing reactions using ddNTPs (specific fluorescence for each ddNTP) and the
appropriate oligonucleotide microsequencing primers which hybridize just upstream of the targeted polymorphic base.
The microsequencing reactions may employ primers capable of being extended to the polymorphic bases of the biallelic
markers. Preferably, the microsequencing primers comprise a sequence terminating at the base immediately preceding
the polymorphic base of the biallelic markers. Once the extension at the 3' end is completed, the primers are separated
from the unincorporated fluorescent ddNTPs by capillary electrophoresis. The separation medium used in capillary
electrophoresis can for example be polyacrylamide, polyethyleneglycol or dextran. The incorporated ddNTPs in the
single-nucleotide primer extension products are identified by fluorescence detection. Preferably, the micro-chip can be used to
process at least 96 samples in parallel. More preferably, the micro-chip can be used to process at least 384 samples in
parallel. Preferably, the microchip is designed for use with detection procedures using four color laser induced
fluorescence detection of the ddNTPs.
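The primer geometry and single-base readout described in this paragraph can be sketched as follows. This is a simplified illustration under stated assumptions: the primer is taken from the same strand as the listed marker sequence, so the ddNTP added at its 3' end corresponds directly to the allelic base on that strand; the default primer length of 19 is an arbitrary illustrative value; and both function names are hypothetical.

```python
def microsequencing_primer(marker_seq: str, snp_index: int, length: int = 19) -> str:
    """Return a primer whose 3' end is the base immediately preceding the
    polymorphic base at 0-based position `snp_index`."""
    if length < 1:
        raise ValueError("length must be positive")
    if snp_index < length or snp_index >= len(marker_seq):
        raise ValueError("not enough upstream sequence for this primer length")
    return marker_seq[snp_index - length:snp_index]


def call_allele(incorporated_ddntp: str, first_allele: str, second_allele: str) -> int:
    """Map the detected (fluorescently labeled) ddNTP to allele 1 or allele 2."""
    if incorporated_ddntp == first_allele:
        return 1
    if incorporated_ddntp == second_allele:
        return 2
    raise ValueError("incorporated base matches neither allele")


if __name__ == "__main__":
    # Toy marker: polymorphic base at index 6, 4-base primer for brevity.
    print(microsequencing_primer("ACGTACGTAC", snp_index=6, length=4))
    print(call_allele("T", first_allele="G", second_allele="T"))
```

A heterozygous sample would yield two different incorporated ddNTPs across the extended primers; handling that case would require calling both signals rather than a single base.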

Any one or more alleles of the biallelic markers in the maps of the present invention, or fragments thereof
containing the polymorphic bases, may be fixed to a solid support, such as a microchip or other immobilizing surface. The
fragments of these nucleic acids may comprise at least 10, at least 15, at least 20, at least 25, or more than 25
consecutive nucleotides of the biallelic markers described herein. Preferably, the fragments include the polymorphic bases of
the biallelic markers.

A nucleic acid sample is applied to the immobilizing surface and analyzed to determine the identities of the
polymorphic bases of one or more of the biallelic markers. In some embodiments, the solid support may also include one or
more of the amplification primers described herein, or fragments comprising at least 10, at least 15, or at least 20
consecutive nucleotides thereof, for generating an amplification product containing the polymorphic bases of the biallelic
markers to be analyzed in the sample.

Another embodiment of the present invention is a solid support which includes one or more of the microsequencing
primers listed in the accompanying Sequence Listing, or fragments comprising at least 10, at least 15, or at least 20
consecutive nucleotides thereof and having a 3' terminus immediately upstream of the polymorphic base of the
corresponding biallelic marker, for determining the identity of the polymorphic base of the one or more biallelic markers fixed
to the solid support.

For example, one embodiment of the present invention is an array of nucleic acids fixed to a solid support, such as
a microchip, bead, or other immobilizing surface, comprising one or more of the biallelic markers in the maps of the present
invention or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides
thereof including the polymorphic base. For example, the array may comprise one or more of any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100), the asthma-associated biallelic
markers, the PG1 biallelic markers, and the new Apo E biallelic markers (including SEQ ID Nos. 301-305/307-311) or the
sequences complementary thereto, or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than
25 consecutive nucleotides thereof including the polymorphic base. In a further embodiment, the array comprises at least
five of the biallelic markers in the maps of the present invention or a fragment comprising at least 10, at least 15, at least
20, at least 25, or more than 25 consecutive nucleotides thereof including the polymorphic base. For example, the arrays
may comprise at least five of any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID
Nos. 1-50 and 51-100), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic
markers (including the sequences of SEQ ID Nos. 301-305/307-311) or the sequences complementary thereto, or a
fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides thereof
including the polymorphic base. In a further embodiment the array comprises at least 10 of the biallelic markers in the
maps of the present invention or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25
consecutive nucleotides thereof including the polymorphic base. For example, the array may comprise at least 10 of any of
the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100), the
asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers (including the sequences of
SEQ ID Nos. 301-305/307-311) or the sequences complementary thereto, or a fragment comprising at least 10, at least 15,
at least 20, at least 25, or more than 25 consecutive nucleotides thereof including the polymorphic base. In a further
embodiment the array comprises at least 20 of the biallelic markers in the maps of the present invention or a fragment
comprising at least 15 consecutive nucleotides thereof including the polymorphic base. For example, the array may comprise
at least 20 of any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and
51-100), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers (including
the sequences of SEQ ID Nos. 301-305/307-311) or the sequences complementary thereto, or a fragment comprising at
least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides thereof including the polymorphic
base. In a further embodiment the array comprises at least 100 of the biallelic markers in the maps of the present
invention or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides
thereof including the polymorphic base. For example, the array may comprise at least 100 of any of the 653 biallelic
markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100), the asthma-associated biallelic
markers, the PG1 biallelic markers, and the new Apo E biallelic markers (including the sequences of SEQ ID Nos.
301-305/307-311) or the sequences complementary thereto, or a fragment comprising at least 10, at least 15, at least 20, at
least 25, or more than 25 consecutive nucleotides thereof including the polymorphic base. In a further embodiment the
array comprises at least 200 of the biallelic markers in the maps of the present invention or a fragment thereof comprising
at least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides thereof including the polymorphic
base. For example, the array may comprise at least 200 of any of the 653 biallelic markers obtained above (which include
the sequences of SEQ ID Nos. 1-50 and 51-100), the asthma-associated biallelic markers, the PG1 biallelic markers, and
the new Apo E biallelic markers (including the sequences of SEQ ID Nos. 301-305/307-311) or the sequences
complementary thereto, or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25
consecutive nucleotides thereof including the polymorphic base. In a further embodiment the array comprises at least 300
of the biallelic markers in the maps of the present invention or a fragment comprising at least 10, at least 15, at least 20, at
least 25, or more than 25 consecutive nucleotides thereof including the polymorphic base. For example, the array may
comprise at least 300 of any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos.
1-50 and 51-100), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers
(including the sequences of SEQ ID Nos. 301-305/307-311) or the sequences complementary thereto, or a fragment
comprising at least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides thereof including the
polymorphic base. In a further embodiment the array comprises at least 400 of the biallelic markers in the maps of the
present invention or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25 consecutive
nucleotides thereof including the polymorphic base. For example, the array may comprise at least 400 of any of the 653
biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100), the asthma-associated
biallelic markers, the PG1 biallelic markers, and the new Apo E biallelic markers (including the sequences of SEQ ID Nos.
301-305/307-311) or the sequences complementary thereto, or a fragment comprising at least 10, at least 15, at least 20,
at least 25, or more than 25 consecutive nucleotides thereof including the polymorphic base. In a further embodiment the
array comprises more than 400 of the biallelic markers in the maps of the present invention or a fragment comprising at
least 10, at least 15, at least 20, at least 25, or more than 25 consecutive nucleotides thereof including the polymorphic
base. For example, the array may comprise at least 400 of any of the 653 biallelic markers obtained above (which include
the sequences of SEQ ID Nos. 1-50 and 51-100), the asthma-associated biallelic markers, the PG1 biallelic markers, and
the new Apo E biallelic markers (including the sequences of SEQ ID Nos. 301-305/307-311) or the sequences
complementary thereto, or a fragment comprising at least 10, at least 15, at least 20, at least 25, or more than 25
consecutive nucleotides thereof including the polymorphic base. Each of the embodiments listed above may also include one
or more of the sequences of SEQ ID Nos. 306 and 312 in addition to those enumerated above.

Another embodiment of the present invention is an array comprising amplification primers for generating
amplification products containing the polymorphic bases of one or more, at least five, at least 10, at least 20, at least 100,
at least 200, at least 300, at least 400, or more than 400 of the biallelic markers in the maps of the present invention. For
example, the array may comprise amplification primers for generating amplification products containing the polymorphic
bases of one or more, at least five, at least 10, at least 20, at least 100, at least 200, at least 300, at least 400, or more
than 400 of any of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and
51-100 or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and
the new Apo E biallelic markers (including the sequences of SEQ ID Nos. 301-305/307-311 or the sequences
complementary thereto). In such arrays, the amplification primers included in the array are capable of amplifying the
biallelic marker sequences to be detected in the nucleic acid sample applied to the array (i.e. the amplification primers
correspond to the biallelic markers affixed to the array). For example, if the array is designed to detect the biallelic marker of
SEQ ID Nos. 1 and 51, it may also contain SEQ ID Nos. 101 and 151, the amplification primers capable of generating an
amplicon which includes SEQ ID Nos. 1 and 51. Thus, the arrays may include one or more of the amplification primers
of SEQ ID Nos. 101-200, 313-317, and 319-323 corresponding to the one or more biallelic markers of SEQ ID Nos. 1-50,
51-100, 301-305, and 307-311 which are included in the array. In other embodiments, the arrays may include
amplification primers capable of generating an amplification product which includes the biallelic markers SEQ ID Nos.
306 and 312 in addition to amplification primers capable of generating an amplification product containing each of the
markers enumerated above. Thus, in such embodiments, the arrays may further include the amplification primers of SEQ
ID Nos. 318 and 324.

Another embodiment of the present invention is an array which includes microsequencing primers capable of
determining the identity of the polymorphic bases of one or more, at least five, at least 10, at least 20, at least 100, at least
200, at least 300, at least 400, or more than 400 of the biallelic markers in the maps of the present invention. For
example, the array may comprise microsequencing primers capable of determining the identity of the polymorphic bases of
one or more, at least five, at least 10, at least 20, at least 100, at least 200, at least 300, at least 400, or more than 400
of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100 or the
sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the new Apo
E biallelic markers (including the sequences of SEQ ID Nos. 301-305/307-311 or the sequences complementary thereto).
The sequences of representative microsequencing primers which may be included in the array are listed in the sequence
listing as SEQ ID Nos. 201-300, 325-329, and 331-335. In other embodiments, the arrays may further include
microsequencing primers for determining the identity of the polymorphic bases of one or more of the sequences of SEQ
ID Nos. 306 and 312, such as the microsequencing primers of SEQ ID Nos. 330 and 336.

Arrays containing any combination of the above nucleic acids which permits the specific detection or
identification of the polymorphic bases of the biallelic markers in the maps of the present invention, including any
combination of the 653 biallelic markers obtained above (which include the sequences of SEQ ID Nos. 1-50 and 51-100
or the sequences complementary thereto), the asthma-associated biallelic markers, the PG1 biallelic markers, and the
new Apo E biallelic markers (including the sequences of SEQ ID Nos. 301-305/307-311 or the sequences complementary
thereto) are also within the scope of the present invention. Other embodiments of the arrays include nucleic acids which
permit the specific detection or identification of the polymorphic bases of one or more of SEQ ID Nos. 306 and 312 in
addition to the nucleic acids permitting the specific detection or identification of the polymorphic bases of the biallelic
markers listed in the preceding sentence. For example, the array may comprise both the biallelic markers and
amplification primers capable of generating amplification products containing the polymorphic bases of the biallelic
markers. Alternatively, the array may comprise both amplification primers capable of generating amplification products
containing the polymorphic bases of the biallelic markers and microsequencing primers capable of determining the
identities of the polymorphic bases of these markers.

Although the above examples describe arrays comprising specific groups of biallelic markers and, in some
embodiments, specific amplification primers and microsequencing primers, it will be appreciated that the present
invention encompasses arrays including any biallelic marker, group of biallelic markers, amplification primer, group of
amplification primers, microsequencing primer, or group of amplification primers described herein, as well as any
combination of the preceding nucleic acids.

Alternatively, the microsequencing procedures described above may be used to determine whether an individual
possesses a pattern of biallelic marker alleles associated with a detectable trait. In this approach, a PCR reaction is
performed on the DNA or RNA of the individual to be tested to amplify the desired biallelic markers or portions thereof. The
amplification product is hybridized to one or more oligonucleotides having their 3' end one base from the position of the
polymorphic bases of the biallelic markers which are fixed to a surface. The oligonucleotides are extended one base using a
detectably labeled dNTP and a polymerase. Incorporation of a pattern of detectably labeled bases indicative of a biallelic
marker pattern associated with a detectable trait indicates that the individual suffers from a detectable trait as the result of
a particular mutation or that the individual is at risk for developing the detectable trait at a subsequent time.
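
The comparison of incorporated bases against a trait-associated pattern can be sketched as follows. This is a minimal illustration, not a described implementation: the marker names, the dictionary layout, and the rule that every marker in the pattern must show its trait-associated base are all assumptions.

```python
from typing import Dict, Set


def matches_trait_pattern(incorporated: Dict[str, Set[str]],
                          trait_pattern: Dict[str, str]) -> bool:
    """Return True when every marker in the pattern shows its trait-associated base.

    incorporated maps a marker name to the set of labeled bases detected at
    that marker's array position (two bases for a heterozygote).
    """
    return all(base in incorporated.get(marker, set())
               for marker, base in trait_pattern.items())


# Hypothetical pattern and two hypothetical individuals.
trait_pattern = {"marker_A": "G", "marker_B": "T"}
individual_1 = {"marker_A": {"A", "G"}, "marker_B": {"T"}}
individual_2 = {"marker_A": {"A"}, "marker_B": {"T"}}

print(matches_trait_pattern(individual_1, trait_pattern))
print(matches_trait_pattern(individual_2, trait_pattern))
```

A marker that produced no signal is treated here as not matching, which is a conservative choice; a real assay would more likely flag it for repeat testing.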

In addition to their use in diagnostic techniques such as those described above, any of the arrays described above
may also be used to identify a haplotype (i.e. a set of alleles of biallelic markers) which is associated with a particular trait.
As described above, in such analyses nucleic acid samples are obtained from trait positive and trait negative individuals and
the alleles of biallelic markers present in each population are determined to identify a haplotype which is statistically
associated with the trait. The arrays may be employed in haplotype analyses as follows. Nucleic acid samples obtained
from trait positive and trait negative individuals are amplified with primers capable of generating amplification products
which include the polymorphic bases of the biallelic markers. The amplification products are labeled with a reporter group
and allowed to contact the biallelic marker probes which are attached to the support. As described above, the biallelic
marker probes to which the labeled amplification products specifically hybridize are determined to indicate which alleles of
the biallelic markers are present in the samples. The patterns of alleles of biallelic markers in the trait positive and trait
negative individuals are then determined to identify a haplotype having a statistically significant association with the trait.

Alternatively, as described above, the nucleic acid samples from trait positive and trait negative individuals may be
applied to an array comprising amplification primers capable of generating amplification products which include the
polymorphic bases of the biallelic markers. The identities of the polymorphic bases in the amplification products are then
determined using techniques such as the microsequencing procedures disclosed herein. Alternatively, amplification can be
conducted in liquid phase and microsequencing may be conducted on the array.

Alternatively, both amplification and microsequencing reactions may be performed in liquid phase. In such
embodiments, the labeled nucleotides incorporated in the microsequencing primers during the microsequencing reactions are
detected by hybridizing the extended microsequencing primers to sequences complementary to the microsequencing primers.
The sequences complementary to the microsequencing primers are immobilized on a support, such as those described above.
The amplification and microsequencing reactions performed in liquid phase may be multiplexed, allowing the samples to be
tested simultaneously for tens, hundreds, thousands or more biallelic markers.
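
Whichever assay format is used, the haplotype analysis ends in a statistical comparison of trait positive and trait negative individuals. The sketch below shows one common way such a comparison could be made, a Pearson chi-square test on a 2x2 table of haplotype carriers versus non-carriers. The choice of test, the function name, and the counts are assumptions for illustration; the text does not prescribe a particular statistic here.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    a, b: trait positive individuals with / without the haplotype
    c, d: trait negative individuals with / without the haplotype
    Returns (chi2, p_value) for one degree of freedom.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 60 of 100 cases and 30 of 100 controls carry the haplotype.
chi2, p_value = chi_square_2x2(60, 40, 30, 70)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}")
```

When many haplotypes are tested, the significance threshold would normally be adjusted for multiple comparisons.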

Preferably, the array used in the haplotype analysis comprises one or more groups of biallelic markers known to be
located in proximity to one another in the genome. For example, the biallelic markers in the groups may be derived from a
single YAC insert, a single BAC insert or a BAC subclone. Alternatively, the biallelic markers in the groups may be derived
from adjacent ordered clones. The biallelic markers in the groups may be located within a genomic region spanning less than
1kb, from 1 to 5kb, from 5 to 10kb, from 10 to 25kb, from 25 to 50kb, from 50 to 150kb, from 150 to 250kb, from 250 to
500kb, from 500kb to 1Mb, or more than 1Mb. In some embodiments, the biallelic markers in the groups comprise biallelic
markers which have been localized to the same chromosome, subchromosomal region, or gene.

It will be appreciated that the ordered DNA containing the biallelic markers need not completely cover the genomic
regions of these lengths but may instead be incomplete contigs having one or more gaps therein.

In some embodiments, the biallelic markers known to be located in proximity to one another in the genome may be
located in physical proximity on the array. For example, the array may comprise one or more groups of at least 3 biallelic
markers known to be located in proximity to one another in the genome. In some embodiments, the array may comprise one
or more groups of at least 6 biallelic markers known to be located in proximity to one another in the genome. In other
embodiments, the array may comprise one or more groups of at least 20 biallelic markers known to be located in proximity
to one another in the genome.
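
One way to form such groups, assuming approximate marker positions are already known, is to sort the markers by position and start a new group whenever the gap to the previous marker exceeds a chosen distance. The sketch below is illustrative only: the marker names, positions, gap threshold, and the gap-based rule itself are assumptions and are not taken from the text.

```python
from typing import Dict, List


def group_by_proximity(positions_kb: Dict[str, float],
                       max_gap_kb: float,
                       min_size: int = 3) -> List[List[str]]:
    """Group markers whose neighbors lie within max_gap_kb of each other.

    Only groups with at least min_size markers are returned.
    """
    groups: List[List[tuple]] = []
    for name, pos in sorted(positions_kb.items(), key=lambda item: item[1]):
        if groups and pos - groups[-1][-1][1] <= max_gap_kb:
            groups[-1].append((name, pos))    # close enough: extend the current group
        else:
            groups.append([(name, pos)])      # gap too large: start a new group
    return [[name for name, _ in g] for g in groups if len(g) >= min_size]


# Hypothetical marker positions in kb along one contig.
positions = {"m1": 0, "m2": 20, "m3": 45, "m4": 400, "m5": 430}
print(group_by_proximity(positions, max_gap_kb=50, min_size=3))
```

Raising `min_size` to 6 or 20 would select the larger groups mentioned above, and the resulting group order can be used to place neighboring markers next to one another on the array.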

The array may comprise one or more groups of biallelic markers known to be located on the same subchromosomal
region. For example, the array could comprise two or more biallelic markers located at 21q11.2 (selected from the group
consisting of SEQ ID Nos. 29, 79, 30 and 80), two or more markers located at 21q21 (selected from the group consisting of
SEQ ID Nos. 1, 51, 2, 52, 3 and 53), two or more markers located at 21q21.2 (selected from the group consisting of SEQ ID
Nos. 17, 67, 18, 68, 19, 69, 20, 70, 21, and 71), two or more markers located at 21q21.3-q22.13 (selected from the group
consisting of SEQ ID Nos. 25, 75, 26, 76, 27, 77, 28, 78, 31, 81, 32, 82, 38, 88, 39, 89, 40, 90, 48, 98, 49, 99, 50, 100,
22, 72, 23, 73, 24, 74, 4, 54, 5, 55, 6, 56, 7, 57, 8, 58, 9, 59, 10, 60, 11, 61, 12, 62, 13, 63, 14, 64, 15, 65, 16, and 66),
two or more markers located at 21q22.2 (selected from the group consisting of SEQ ID Nos. 41, 91, 42, 92, 43, 93, 44,
94, 45, 95, 46, 96, 47, and 97), and two or more markers located at 21q22.3 (selected from the group consisting of SEQ
ID Nos. 33, 83, 34, 84, 35, 85, 36, 86, 37, and 87). Alternatively, the array could comprise amplification primers capable of
generating an amplification product containing the polymorphic bases of two or more biallelic markers located at 21q11.2
(for example, amplification primers capable of generating an amplification product containing the polymorphic bases of two or
more biallelic markers selected from the group consisting of SEQ ID Nos. 29, 79, 30 and 80), two or more markers located
at 21q21 (for example, amplification primers capable of generating an amplification product containing the polymorphic
bases of two or more biallelic markers selected from the group consisting of SEQ ID Nos. 1, 51, 2, 52, 3 and 53), two or
more markers located at 21q21.2 (for example, amplification primers capable of generating an amplification product
containing the polymorphic bases of two or more biallelic markers selected from the group consisting of SEQ ID Nos. 17, 67,
18, 68, 19, 69, 20, 70, 21, and 71), two or more markers located at 21q21.3-q22.13 (for example, amplification primers
capable of generating an amplification product containing the polymorphic bases of two or more biallelic markers selected
from the group consisting of SEQ ID Nos. 25, 75, 26, 76, 27, 77, 28, 78, 31, 81, 32, 82, 38, 88, 39, 89, 40, 90, 48, 98, 49,
99, 50, 100, 22, 72, 23, 73, 24, 74, 4, 54, 5, 55, 6, 56, 7, 57, 8, 58, 9, 59, 10, 60, 11, 61, 12, 62, 13, 63, 14, 64, 15,
65, 16, and 66), two or more markers located at 21q22.2 (for example, amplification primers capable of generating an
amplification product containing the polymorphic bases of two or more biallelic markers selected from the group consisting
of SEQ ID Nos. 41, 91, 42, 92, 43, 93, 44, 94, 45, 95, 46, 96, 47, and 97), and two or more markers located at 21q22.3
(for example, amplification primers capable of generating an amplification product containing the polymorphic bases of two
or more biallelic markers selected from the group consisting of SEQ ID Nos. 33, 83, 34, 84, 35, 85, 36, 86, 37, and 87).

In some embodiments, the array may comprise one or more groups of biallelic markers derived from the same BAC
insert. For example, the array could comprise two or more markers selected from the group consisting of SEQ ID Nos. 29,
79, 30, and 80 (derived from BAC 1), two or more markers selected from the group consisting of SEQ ID Nos. 1 and 51
(derived from BAC 2), two or more markers selected from the group consisting of SEQ ID Nos. 2, 52, 3, and 53 (derived
from BAC 3), two or more markers selected from the group consisting of SEQ ID Nos. 17, 67, 18, 68, 19, 69, 20, 70, 21
and 71 (derived from BAC 4), two or more markers selected from the group consisting of SEQ ID Nos. 25, 75, 26, 76, 27,
and 77 (derived from BAC 5), two or more markers selected from the group consisting of SEQ ID Nos. 28, 78, 31, 81, 32, and
82 (derived from BAC 6), two or more markers selected from the group consisting of SEQ ID Nos. 38, 88, 39, 89, 40, and
90 (derived from BAC 7), two or more markers selected from the group consisting of SEQ ID Nos. 48, 98, 49, 99, 50, and
100 (derived from BAC 8), two or more markers selected from the group consisting of SEQ ID Nos. 22, 72, 23, 73, 24, and
74 (derived from BAC 9), two or more markers selected from the group consisting of SEQ ID Nos. 4, 54, 5, 55, 6, 56, 7, 57,
8, 58, 9, 59, 10, and 60 (derived from BAC 10), two or more markers selected from the group consisting of SEQ ID Nos.
11, 61, 12, 62, 13, 63, 14, 64, 15, 65, 16, and 66 (derived from BAC 11), two or more markers selected from the group
consisting of SEQ ID Nos. 41, 91, 42, 92, 43, 93, 44, 94, 45, 95, 46, 96, 47, and 97 (derived from BAC 12), or two or more
markers selected from the group consisting of SEQ ID Nos. 33, 83, 34, 84, 35, 85, 36, 86, 37, and 87 (derived from BAC
13).

Arrays comprising biallelic markers known to be located in proximity to one another in the genome permit
haplotyping analyses to be conducted even when the chromosomal locations of the biallelic markers have not been
determined. For example, using the procedures described above, the alleles of sets of biallelic markers which are present in
nucleic acid samples from trait positive and trait negative individuals may be determined using a succession of arrays, with
each array having one or more groups of nucleic acids known to be located in proximity to one another thereon. The
succession of arrays may comprise biallelic markers spanning the entire genome having any of the average intermarker
distances specified above. Alternatively, the succession of arrays need not span the entire genome but may instead be
derived from two or more contiguous YAC, BAC, or BAC subclone inserts. A statistical analysis is performed on the alleles
of biallelic markers present in the trait positive and trait negative individuals to identify a haplotype having a statistically
significant association with the trait. Once a statistically significant haplotype is identified, the genomic locations of the
biallelic markers comprising the haplotype may be determined using the methods described herein. In addition, using the
procedures described herein, the genomic region harboring the biallelic markers in the statistically significant haplotype may
be evaluated to identify the genes associated with the trait.

Although this invention has been described in terms of certain preferred embodiments, other embodiments which
will be apparent to those of ordinary skill in the art in view of the disclosure herein are also within the scope of this
invention. Accordingly, the scope of the invention is intended to be defined only by reference to the appended claims.
Table 1



Biallelic marker   BAC   Insert size   Average intermarker   Subchromosomal
(Genset code)            (kb)          distance (kb)         localization

99-23/8            1     150           75                    21q11.2
99-2381            1     150           75                    21q11.2

99-2103            2     110           110                   21q21

99-2228            3     105           52.5                  21q21
99-2229            3     105           52.5                  21q21

99-2312            4     130           26                    21q21.2
99-2315            4     130           26                    21q21.2
99-2320            4     130           26                    21q21.2
99-2321            4     130           26                    21q21.2
99-2324            4     130           26                    21q21.2

99-2362            5     100           33.3                  21q21.3-q22.13
99-2364            5     100           33.3                  21q21.3-q22.13
99-2367            5     100           33.3                  21q21.3-q22.13

99-2371            6     135           45                    21q22.11-q22.13
99-2413            6     135           45                    21q22.11-q22.13
99-2419            6     135           45                    21q22.11-q22.13

99-2610            7     185           61.7                  21q22.11-q22.13
99-2515            7     185           61.7                  21q22.11-q22.13
99-2620            7     185           61.7                  21q22.11-q22.13

99-2645            8     250           83.3                  21q22.11-q22.13
99-2647            8     250           83.3                  21q22.11-q22.13
99-2649            8     250           83.3                  21q22.11-q22.13

99-2333            9     140           46.7                  21q22.11-q22.13
99-2341            9     140           46.7                  21q22.11-q22.13
99-2342            9     140           46.7                  21q22.11-q22.13

99-2240            10    95            13.6                  21q22.11-q22.13
99-2242            10    95            13.6                  21q22.11-q22.13
99-2244            10    95            13.6                  21q22.11-q22.13
99-2245            10    95            13.6                  21q22.11-q22.13
99-2248            10    95            13.6                  21q22.11-q22.13
99-2250            10    95            13.6                  21q22.11-q22.13
99-2251            10    95            13.6                  21q22.11-q22.13

99-2269            11    40            6.7                   21q22.11-q22.13
99-2271            11    40            6.7                   21q22.11-q22.13
99-2272            11    40            6.7                   21q22.11-q22.13
99-2273            11    40            6.7                   21q22.11-q22.13
99-2275            11    40            6.7                   21q22.11-q22.13
99-2278            11    40            6.7                   21q22.11-q22.13

99-2624            12    165           23.6                  21q22.2
99-2525            12    165           23.6                  21q22.2
99-2630            12    165           23.6                  21q22.2
99-2633            12    165           23.6                  21q22.2
99-2634            12    165           23.6                  21q22.2
99-2637            12    165           23.6                  21q22.2
99-2542            12    165           23.6                  21q22.2

99-2559            13    205           41                    21q22.3
99-2566            13    205           41                    21q22.3
99-2567            13    205           41                    21q22.3
99-2570            13    205           41                    21q22.3
99-2571            13    205           41                    21q22.3
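
The average intermarker distances listed in Table 1 appear consistent with dividing each BAC insert size by the number of biallelic markers found on that BAC. That relationship is an inference from the table's numbers rather than something the text states, so the following is only a sketch of the arithmetic; the dictionary layout and function name are assumptions, while the insert sizes and marker counts are read from Table 1.

```python
# BAC number -> (insert size in kb, number of biallelic markers listed for that BAC)
TABLE_1 = {
    1: (150, 2), 2: (110, 1), 3: (105, 2), 4: (130, 5), 5: (100, 3),
    6: (135, 3), 7: (185, 3), 8: (250, 3), 9: (140, 3), 10: (95, 7),
    11: (40, 6), 12: (165, 7), 13: (205, 5),
}


def average_intermarker_distance(insert_kb: float, marker_count: int) -> float:
    """Insert size divided by marker count, rounded to one decimal place."""
    if marker_count < 1:
        raise ValueError("a BAC needs at least one marker")
    return round(insert_kb / marker_count, 1)


distances = {bac: average_intermarker_distance(size, count)
             for bac, (size, count) in TABLE_1.items()}

for bac, dist in distances.items():
    print(f"BAC {bac}: {dist} kb")
```

The marker counts sum to 50, which matches the 50 markers tabulated across the 13 BACs.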
CLAIMS



1. A method of obtaining a plurality of biallelic markers comprising the steps of:

obtaining a nucleic acid library comprising a plurality of genomic DNA
fragments comprising the full genome or a portion thereof;

determining the order of said plurality of genomic DNA fragments in the
genome;

determining the sequence of selected regions of said plurality of genomic DNA
fragments; and

identifying nucleotides in said plurality of genomic DNA fragments which vary
between individuals, thereby defining a set of biallelic markers.

2. The method of Claim 1, further comprising selecting a minimally overlapping
set of genomic fragments from said nucleic acid library.

3. The method of Claims 1 or 2, further comprising identifying one biallelic
marker per genomic DNA fragment.

4. The method of Claims 1 or 2, further comprising identifying two or more
biallelic markers per genomic DNA fragment.

5. The method of Claim 1, further comprising detecting a set of biallelic markers
having a desired average heterozygosity rate.

6. The method of Claims 1 or 5, further comprising selecting biallelic markers
having a heterozygosity rate of at least about 0.18.

7. The method of Claims 1 or 5, further comprising selecting biallelic markers
having a heterozygosity rate of at least about 0.32.

8. The method of Claims 1 or 5, further comprising selecting biallelic markers
having a heterozygosity rate of at least about 0.42.

9. The method of Claim 1, wherein said identifying step comprises identifying at
least about 20,000 biallelic markers.

10. The method of Claim 1, wherein said biallelic markers are separated from one
another by an average distance of 10 kb - 200 kb.

11. The method of Claim 1, wherein said biallelic markers are separated from one
another by an average distance of 25 kb - 50 kb.

12. The method of Claim 1, wherein the step of determining the sequence of
selected regions of said plurality of genomic DNA fragments comprises inserting fragments of
said plurality of genomic DNA fragments into a vector to generate a plurality of subclones and
determining the sequence of a region of the inserts in said plurality of subclones or a subset
thereof.

13. The method of Claim 12, wherein said step of determining the sequence of a 
region of said inserts or a subset thereof comprises determining the sequence of one or both end 
regions of said inserts or a subset thereof. 

14. The method of Claim 1, wherein a set of about 10,000 to about 30,000 genomic 
DNA inserts with an average size between 100 kb and 300 kb are ordered. 

15. The method of Claim 1, wherein said identifying step comprises identifying 
between 1 and 6 biallelic markers per genomic DNA fragment. 

16. The method of Claim 1, wherein said identifying step comprises identifying an
average of 3 biallelic markers per genomic DNA insert.

17. The method of Claim 1, wherein said genomic DNA fragments are in a
Bacterial Artificial Chromosome.

18. The method of Claim 1, further comprising determining the position of said
biallelic markers along the genome or a portion thereof.

19. The method of Claim 1, further comprising obtaining pluralities of biallelic 
markers such that each marker is in linkage disequilibrium with at least one of identified 
markers. 

20. The method of Claim 1, wherein said portion of the genome comprises at least 
200 kb of contiguous genomic DNA. 

21. The method of Claim 1, wherein said portion of the genome comprises at least 
2 Mb of contiguous genomic DNA. 

22. The method of Claim 1, wherein said portion of the genome comprises at least 
20 Mb of contiguous genomic DNA. 

23. The method of Claim 1, further comprising the step of identifying one or more 
groups of biallelic markers which are in proximity to one another in the genome. 

24. The method of Claim 23, wherein the biallelic markers in each of these groups 
are located within a genomic region spanning from 1 to 5 kb. 

25. The method of Claim 23, wherein the biallelic markers in each of these groups 
are located within a genomic region spanning from 5 kb to 1 Mb. 

26. The method of Claim 23, wherein the biallelic markers in each of these groups 
are located within a genomic region spanning more than 1 Mb. 

27. A set of biallelic markers obtained by the method of Claim 1, wherein the 
markers in said set are on average evenly spaced over the full genome or a portion thereof. 

28. The set of biallelic markers of Claim 27, wherein the markers in said set are 
ordered relative to one another. 

29. The set of biallelic markers according to Claim 27 or Claim 28, wherein the 
markers in said set have a known genomic position. 

30. The set of biallelic markers of Claim 27, wherein said biallelic markers are 
separated from one another by an average distance of 100 to 150 kb. 

31. The set of biallelic markers of Claim 27, wherein said biallelic markers are 
separated from one another by an average distance of 25 to 50 kb. 

32. The set of biallelic markers of Claim 27, wherein said biallelic markers are 
separated from one another by an average distance of 10 to 200 kb. 
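
The average inter-marker distances recited in Claims 30 to 32 can be checked with a short calculation. The sketch below is illustrative only: the function name and the convention of giving marker positions in base pairs along a single contiguous region are assumptions, not part of the application.

```python
def average_spacing_kb(positions_bp):
    """Average distance, in kb, between adjacent markers.

    positions_bp: marker coordinates in base pairs along one contiguous
    region (hypothetical input format); order does not matter.
    """
    ordered = sorted(positions_bp)
    if len(ordered) < 2:
        raise ValueError("need at least two markers to measure spacing")
    # Distances between each marker and the next one along the region.
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps) / 1000.0


def spacing_within(positions_bp, low_kb, high_kb):
    """True if the average spacing falls inside [low_kb, high_kb]."""
    return low_kb <= average_spacing_kb(positions_bp) <= high_kb
```

For example, markers at 0, 100,000, 250,000 and 300,000 bp have an average spacing of 100 kb, which falls within the 100 to 150 kb window of Claim 30 and the 10 to 200 kb window of Claim 32, but not the 25 to 50 kb window of Claim 31.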

33. The set of biallelic markers of Claim 27, wherein said biallelic markers have a 
heterozygosity rate of at least about 0.18. 

34. The set of biallelic markers of Claim 27, wherein said biallelic markers have a 
heterozygosity rate of at least about 0.32. 

35. The set of biallelic markers of Claim 27, wherein said biallelic markers have a 
heterozygosity rate of at least about 0.42. 
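
To make the thresholds of Claims 33 to 35 concrete, the following sketch assumes that the heterozygosity rate of a biallelic marker is the expected heterozygosity under Hardy-Weinberg proportions, H = 2p(1 - p), where p is the frequency of the less common allele. That formula and the function names are assumptions made for this illustration; the claims themselves do not state how the rate is computed.

```python
import math


def heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) for an allele frequency p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return 2.0 * p * (1.0 - p)


def min_minor_allele_frequency(h):
    """Smallest minor-allele frequency giving heterozygosity >= h.

    Solves 2p(1 - p) = h for the root with p <= 0.5. A biallelic marker
    cannot exceed h = 0.5 under this model.
    """
    if not 0.0 <= h <= 0.5:
        raise ValueError("heterozygosity of a biallelic marker lies in [0, 0.5]")
    return (1.0 - math.sqrt(1.0 - 2.0 * h)) / 2.0
```

Under this assumption the three thresholds of 0.18, 0.32 and 0.42 correspond to minor-allele frequencies of about 0.1, 0.2 and 0.3 respectively.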

36. A map comprising an ordered array of at least 20,000 biallelic markers obtained 
by the method of Claim 1. 

37. A method of identifying one or more biallelic markers associated with a 
detectable trait comprising the steps of: 

determining the frequencies of each allele of said one or more biallelic markers 
obtained by the method of claim 1 in individuals who express said detectable trait and 
individuals who do not express said detectable trait; and 

identifying one or more alleles of said one or more biallelic markers which are 
statistically associated with the expression of said detectable trait. 
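
One common way to test the statistical association described in Claim 37 is a chi-square test on a 2x2 table of allele counts in trait-positive and trait-negative individuals. The sketch below is a minimal example of that test and is not presented as the applicants' procedure; the function name and argument layout are hypothetical, and the counts are allele counts (two per individual).

```python
import math


def allele_association(case_a1, case_a2, ctrl_a1, ctrl_a2):
    """Pearson chi-square test (1 degree of freedom) on allele counts.

    case_a1, case_a2: counts of allele 1 and allele 2 in individuals
                      who express the trait
    ctrl_a1, ctrl_a2: the same counts in individuals who do not
    Returns (chi2, p_value).
    """
    n = case_a1 + case_a2 + ctrl_a1 + ctrl_a2
    row_case = case_a1 + case_a2
    row_ctrl = ctrl_a1 + ctrl_a2
    col_a1 = case_a1 + ctrl_a1
    col_a2 = case_a2 + ctrl_a2
    if 0 in (row_case, row_ctrl, col_a1, col_a2):
        raise ValueError("every row and column of the table must be non-empty")

    # Shortcut formula for a 2x2 table, without continuity correction.
    cross = case_a1 * ctrl_a2 - case_a2 * ctrl_a1
    chi2 = n * cross * cross / (row_case * row_ctrl * col_a1 * col_a2)

    # Upper tail of the chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

For instance, 60 and 40 copies of the two alleles in trait-positive individuals against 40 and 60 in trait-negative individuals gives a chi-square statistic of 8.0 and a p-value below 0.005. Small expected counts would call for an exact test instead.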

38. A method of identifying a haplotype associated with a trait comprising the steps of: 

obtaining nucleic acid samples from trait positive and trait negative individuals; 

determining the frequencies of the alleles of each member of a group of biallelic 
markers obtained by the method of claim 1 located in proximity to one another in the genome in 
said nucleic acid samples; and 

identifying a plurality of alleles of biallelic markers having a statistically 
significant association with said trait. 

39. The method of Claim 38, wherein the biallelic markers in each of these groups 
are located within a genomic region spanning from 1 to 5 kb. 

40. The method of Claim 38, wherein the biallelic markers in each of these groups 
are located within a genomic region spanning from 5 kb to 1 Mb. 

41. The method of Claim 38, wherein the biallelic markers in each of these groups 
are located within a genomic region spanning more than 1 Mb. 

42. A method of identifying one or more biallelic markers associated with a 
detectable trait comprising the steps of: 

selecting a gene in which mutations result in a detectable trait or a gene suspected of being 
associated with a detectable trait; and 

identifying one or more biallelic markers obtained by the method of Claim 1 
within the genomic region harboring said gene which are associated with said detectable trait. 

43. The method of Claim 42, wherein said identifying step comprises: 

determining the frequencies of said one or more biallelic markers in individuals 
who express said detectable trait and individuals who do not express said detectable trait; and 

identifying one or more biallelic markers which are statistically associated with 
the expression of said detectable trait. 

44. An array of nucleic acids fixed to a support, said nucleic acids comprising at 
least 8 consecutive nucleotides, including the polymorphic nucleotide, of one or more biallelic 
markers obtained by the method of Claim 1. 

45. An array of nucleic acids fixed to a support, said nucleic acids comprising at 
least 8 consecutive nucleotides, including the polymorphic nucleotide, of one or more groups of 
biallelic markers obtained by the method of Claim 1 known to be located in proximity to one 
another in the genome. 

46. An array of nucleic acids fixed to a support, said nucleic acids comprising 
amplification primers for generating an amplification product comprising at least 8 consecutive 
nucleotides, including the polymorphic nucleotide, of one or more groups of biallelic markers 
obtained by the method of Claim 1 known to be located in proximity to one another in the 
genome. 

47. An array of nucleic acids fixed to a support, said nucleic acids 
comprising one or more microsequencing primers for determining the identity of the 
polymorphic bases of one or more groups of biallelic markers obtained by the method of Claim 
1 known to be located in proximity to one another in the genome. 

48. An array of nucleic acids fixed to a support, wherein said nucleic acids are 
complementary to one or more microsequencing primers for determining the identities of the 
polymorphic bases of one or more biallelic markers obtained by the method of Claim 1 known 
to be located in proximity to one another in the genome. 

49. The array of any one of Claims 45 to 48, wherein the members of each of said 
one or more groups of biallelic markers are located in physical proximity to one another on said 
support. 

50. The array of any one of Claims 45 to 48, wherein the biallelic markers in each 
of these groups are located within a genomic region spanning from 1 to 5 kb. 

51. The array of any one of Claims 45 to 48, wherein the biallelic markers in each 
of these groups are located within a genomic region spanning from 5 kb to 1 Mb. 

52. The array of any one of Claims 45 to 48, wherein the biallelic markers in each 
of these groups are located within a genomic region spanning more than 1 Mb. 

53. The array of any one of Claims 45 to 48, wherein each group of biallelic 
markers comprises at least 3 biallelic markers. 

54. The array of any one of Claims 45 to 48, wherein each group of biallelic 
markers comprises at least 20 biallelic markers. 

55. A method for determining whether an individual is at risk of developing a 
detectable trait or suffers from a detectable trait associated with said trait comprising the steps 
of: 

obtaining a nucleic acid sample from said individual; 

screening said nucleic acid sample with one or more biallelic markers obtained 
by the method of Claim 1; and 

determining whether said nucleic acid sample contains one or more of biallelic 
markers statistically associated with said detectable trait. 

56. The method of Claim 55, wherein said biallelic markers were obtained by the 
method of Claim 37. 

57. The method of Claim 55, wherein said biallelic markers were obtained by the 
method of Claim 42. 

58. A method of using a drug comprising: 

obtaining a nucleic acid sample from an individual; 

determining the identity of the polymorphic base of one or more biallelic 
markers obtained by the method of Claim 1 which is associated with a positive response to 
treatment with said drug or one or more biallelic markers obtained by the method of Claim 1 
which is associated with a negative response to treatment with said drug; and 

administering said drug to said individual if said nucleic acid sample contains 
one or more biallelic markers associated with a positive response to treatment with said drug or 
if said nucleic acid sample lacks one or more biallelic markers associated with a negative 
response to said drug. 

59. The method of Claim 58, wherein said determining step comprises determining 
the identity of the polymorphic base of one or more biallelic markers obtained by the method of 
Claim 37 which is associated with a positive response to treatment with said drug or one or 
more biallelic markers obtained by the method of Claim 37 which is associated with a negative 
response to treatment with said drug. 

60. The method of Claim 58, wherein said determining step comprises determining 
the identity of the polymorphic base of one or more biallelic markers obtained by the method of 
Claim 42 which is associated with a positive response to treatment with said drug or one or 
more biallelic markers obtained by the method of Claim 42 which is associated with a negative 
response to treatment with said drug. 

61. A method of selecting an individual for inclusion in a clinical trial of a drug 
comprising: 

obtaining a nucleic acid sample from an individual; 

determining the identity of the polymorphic base of one or more biallelic 
markers obtained by the method of Claim 1 which is associated with a positive response to 
treatment with said drug or one or more biallelic markers associated with a negative response to 
treatment with said drug in said nucleic acid sample; and 

including said individual in said clinical trial if said nucleic acid sample 
contains one or more biallelic markers obtained by the method of Claim 1 which is associated 
with a positive response to treatment with said drug or if said nucleic acid sample lacks one or 
more biallelic markers associated with a negative response to said drug. 

62. The method of Claim 61, wherein said determining step comprises determining 
the identity of the polymorphic base of one or more biallelic markers obtained by the method of 
Claim 37 which is associated with a positive response to treatment with said drug or one or 
more biallelic markers obtained by the method of Claim 37 which is associated with a negative 
response to treatment with said drug. 

63. The method of Claim 61, wherein said determining step comprises determining 
the identity of the polymorphic base of one or more biallelic markers obtained by the method of 
Claim 42 which is associated with a positive response to treatment with said drug or one or 
more biallelic markers obtained by the method of Claim 42 which is associated with a negative 
response to treatment with said drug. 

64. A method of identifying a gene associated with a detectable trait comprising the 
steps of: 

determining the frequency of each allele of one or more biallelic markers 
obtained by the method of Claim 1 in individuals having said detectable trait and individuals 
lacking said detectable trait; 

identifying one or more alleles of one or more biallelic markers having a 
statistically significant association with said detectable trait; and 

identifying a gene in linkage disequilibrium with said one or more alleles. 

65. The method of Claim 64, further comprising identifying a mutation in the gene 
which is associated with said detectable trait. 

66. A method of identifying a gene associated with a detectable trait comprising: 

selecting a gene suspected of being associated with a detectable trait; and 

identifying one or more biallelic markers obtained by the method of Claim 1 
within the genomic region harboring said gene which are associated with said detectable trait. 

67. The method of any one of Claims 37, 38, 42, 55, 64 or 66, wherein said 
detectable trait is selected from the group consisting of disease, drug response, drug efficacy, 
and drug toxicity. 

68. The method of Claim 66, wherein said identifying step comprises: 

determining the frequencies of said one or more biallelic markers in individuals 
who express said detectable trait and individuals who do not express said detectable trait; and 

identifying one or more biallelic markers which are statistically associated with the 
expression of said detectable trait. 

69. A method of identifying a haplotype associated with a trait comprising the 
steps of: 

obtaining nucleic acid samples from trait positive and trait negative individuals; 

conducting an amplification reaction on said nucleic acid samples using amplification 
primers capable of generating amplification products containing the polymorphic bases of a 
plurality of biallelic markers; 

contacting one or more arrays of nucleic acids fixed to a support with said amplification 
products, wherein said nucleic acids fixed to a support comprise at least 8 consecutive 
nucleotides, including the polymorphic nucleotide, of one or more groups of biallelic markers 
obtained by the method of Claim 1 known to be located in proximity to one another in the 
genome; 

determining the identities of the polymorphic bases of said amplification products; and 

identifying a haplotype having a statistically significant association with said trait. 

70. A method of identifying a haplotype associated with a trait comprising the steps 
of: 

obtaining nucleic acid samples from trait positive and trait negative individuals; 

conducting amplification reactions on said nucleic acid samples using 
amplification primers capable of generating amplification products containing the polymorphic 
bases of a plurality of biallelic markers; 

contacting one or more arrays of nucleic acids fixed to a support with said 
amplification products, wherein said nucleic acids fixed to a support comprise one or 
more microsequencing primers for determining the identity of the polymorphic bases of one or 
more groups of biallelic markers obtained by the method of Claim 1 known to be located in 
proximity to one another in the genome; 

conducting microsequencing reactions on said amplification products using 
microsequencing primers on said arrays, thereby generating elongated microsequencing primers 
comprising the polymorphic bases of said amplification products; 

determining the identities of said polymorphic bases; and 

identifying a haplotype having a statistically significant association with said 
trait. 

71. A method of identifying a haplotype associated with a trait comprising the steps 
of: 

obtaining nucleic acid samples from trait positive and trait negative individuals; 

conducting amplification reactions on said nucleic acid samples using 
amplification primers which are capable of generating amplification products containing the 
polymorphic bases of a plurality of biallelic markers; 

conducting microsequencing reactions on said nucleic acid samples, thereby 
generating microsequencing products containing the polymorphic bases of one or more biallelic 
markers at their 3' ends, said polymorphic bases being detectably labeled; 

contacting one or more arrays according to Claim 48 with said microsequencing 
products such that said microsequencing products specifically hybridize to said nucleic acids 
complementary to said microsequencing primers; 

determining the identities of the polymorphic bases of said microsequencing 
products; and 

identifying a haplotype having a statistically significant association with said 
trait. 

72. A method of identifying a haplotype associated with a trait comprising the steps 
of: 

obtaining nucleic acid samples from trait positive and trait negative individuals; 

contacting one or more arrays of nucleic acids fixed to a support with said 
nucleic acid sample, wherein said nucleic acids fixed to a support comprise amplification 
primers for generating an amplification product comprising at least 8 consecutive nucleotides, 
including the polymorphic nucleotide, of one or more groups of biallelic markers obtained by 
the method of Claim 1 known to be located in proximity to one another in the genome; 

conducting an amplification reaction on said nucleic acid samples using 
amplification primers on said array which are capable of generating amplification products 
containing the polymorphic bases of a plurality of biallelic markers; 

determining the identities of the polymorphic bases of said amplification 
products; and 

identifying a haplotype having a statistically significant association with said 
trait. 

73. A method of determining whether an individual is at risk of developing 
Alzheimer's disease or whether the individual suffers from Alzheimer's disease as a result of 
possessing the Apo E e4 Site A allele comprising: 

obtaining a nucleic acid sample from said individual; and 

determining the identity of the polymorphic base in one or more of the 
sequences selected from the group consisting of SEQ ID Nos. 301-305 and SEQ ID Nos. 307-311 
or the sequences complementary thereto in said nucleic acid sample. 

74. The method of Claim 73, further comprising determining whether said nucleic 
acid sample contains the sequence of SEQ ID No. 306 or the sequence complementary thereto. 

75. The method of Claim 73, wherein said step of determining the identity of the 
polymorphic bases in one or more of the sequences selected from the group consisting of SEQ 
ID Nos. 301-305 and SEQ ID Nos. 307-311 or the sequences complementary thereto comprises 
determining whether said nucleic acid sample contains the sequence of SEQ ID No. 311 or the 
sequence complementary thereto. 

76. The method of Claim 75, further comprising determining whether said nucleic 
acid sample contains the sequence of SEQ ID No. 306 or the sequence complementary thereto. 

77. An isolated nucleic acid comprising a sequence selected from the group 
consisting of SEQ ID No. 301, SEQ ID No. 307, the sequences complementary thereto, and 
fragments comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, 
thereof. 

78. An isolated nucleic acid comprising a sequence selected from the group 
consisting of SEQ ID No. 302, SEQ ID No. 308, the sequences complementary thereto, and 
fragments comprising at least 8 consecutive nucleotides thereof. 

79. An isolated nucleic acid comprising a sequence selected from the group 
consisting of SEQ ID No. 303, SEQ ID No. 309, the sequences complementary thereto, and 
fragments comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, 
thereof. 

80. An isolated nucleic acid comprising a sequence selected from the group 
consisting of SEQ ID No. 304, SEQ ID No. 310, the sequences complementary thereto, and 
fragments comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, 
thereof. 

81. An isolated nucleic acid comprising a sequence selected from the group 
consisting of SEQ ID No. 305, SEQ ID No. 311, the sequences complementary thereto, and 
fragments comprising at least 8 consecutive nucleotides, including the polymorphic nucleotide, 
thereof. 

82. An isolated nucleic acid comprising a sequence selected from the group 
consisting of SEQ ID Nos. 313-317, SEQ ID Nos. 319-323, and fragments comprising at least 8 
consecutive nucleotides thereof. 

83. An isolated nucleic acid comprising a sequence selected from the group 
consisting of SEQ ID Nos. 325-329, SEQ ID Nos. 331-335, the sequence complementary 
thereto, and fragments comprising at least 8 consecutive nucleotides thereof. 

84. A set of nucleic acids comprising amplification primers for generating an 
amplification product comprising at least 8 consecutive nucleotides, including the polymorphic 
nucleotide, of one or more biallelic markers obtained by the method of Claim 1. 

85. A set of nucleic acids comprising one or more microsequencing primers for 
determining the identity of the polymorphic base of one or more nucleic acids comprising at 
least 8 consecutive nucleotides, including the polymorphic nucleotide, of one or more biallelic 
markers obtained by the method of Claim 1. 

PROSTATE CANCER HAPLOTYPE SIMULATIONS (100 ITERATIONS) 

KNOBBE, MARTENS, OLSON & BEAR, LLP 

Customer No. 20,995 






09/463075 

WO 99/04038 PCT/IB98/01193 


SEQUENCE LISTING 

(1) GENERAL INFORMATION: 
(i) APPLICANT: 

(A) COHEN, Daniel 
(B) BLUMENFELD, Marta 
(C) TCHOUMAKOV, Ilia 



(ii) TITLE OF INVENTION: Biallelic markers for use in constructing a 
high density disequilibrium map of the human genome. 

(iii) NUMBER OF SEQUENCES: 336 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy Disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: Win95 

(D) SOFTWARE: Word 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2103-270 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2103-270 
(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2103-270 

(B) LOCATION: complement 25..47 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

CTTGGATTCA TATGAGACAG CTAGCAGACC TTCAATTTTT CTACACT 



(2) INFORMATION FOR SEQ ID NO: 2: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2228-301 

(B) LOCATION: 1..47 



(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 



(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2228-301 

(B) LOCATION: 1..23 



(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2228-301 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 



CCCTGCTTAT CCCTGTAAGG TGGAGACCCA TATGGGCAAG GCCAGAC 
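
Each record in this listing describes a 47-base fragment with the polymorphic base at position 24 and two potential microsequencing oligos, one at positions 1..23 and one at the complement of positions 25..47. The sketch below shows one way to derive those three pieces from a listed sequence. It assumes that "complement 25..47" denotes the reverse complement of that span read 5' to 3', as is usual for feature locations; the function names are hypothetical, and the listed sequence carries only one of the two alleles.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def microsequencing_oligos(fragment, poly_pos=24, flank=23):
    """Split a listed fragment into (upstream oligo, polymorphic base,
    downstream oligo on the complementary strand).

    fragment: sequence as printed, spaces between blocks are ignored
    poly_pos: 1-based position of the polymorphic base (24 in this listing)
    flank:    oligo length on each side (23 in this listing)
    """
    seq = "".join(fragment.split()).upper()
    if len(seq) < poly_pos + flank or poly_pos - 1 < flank:
        raise ValueError("fragment too short for the requested flanks")
    upstream = seq[poly_pos - 1 - flank:poly_pos - 1]      # positions 1..23
    base = seq[poly_pos - 1]                                # position 24
    downstream = reverse_complement(seq[poly_pos:poly_pos + flank])  # 25..47
    return upstream, base, downstream
```

Applied to SEQ ID NO: 2 as printed above, the polymorphic base at position 24 is A, which agrees with the "base a" entry in that record.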



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2229-240 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2229-240 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2229-240 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 



TCGTCATCGT GGCCTGGGCT ACAGACTACC TGTTCCAGTC CTTCCAG 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2240-281 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2240-281 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2240-281 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 



GCAATCTTAA TAACTTTTTA TTTCAGTAAT TCGAATCTTT TTTTTCT 47 



(2) INFORMATION FOR SEQ ID NO: 5: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2242-206 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2242-206 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2242-206 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 



GTGTTTTCTT TTAGTCAAAT TATCTTATAT TTTACTTTTT TCTTAAG 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2244-83 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2244-83 
(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2244-83 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 



TAATTGTAGA TACTAAGACC ATTATGCTTA AACCATGTAG GTACTGA 



(2) INFORMATION FOR SEQ ID NO: 7: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2246-340 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2246-340 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2246-340 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 



ATTTATATGT TAAATGCAGA GAAAAAGAAA AATAAGTTTT GCAGTAA 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2248-76 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2248-76 
(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2248-76 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 



GACAGAGAGG GAAGGTAATC TTCCCCTGAA GTCTGCCCAT CCCCTGG 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2250-236 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2250-236 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2250-236 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 



ATGTATCCAA AACAGAATTA ACACACTTTG GGTTTTTTAT TTTTATT 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2251-151 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2251-151 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2251-151 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



TGAAAAGAAG TTCAGACGAT TGCAGATAGA CTAGTTTGGC TGTTGTG 



(2) INFORMATION FOR SEQ ID NO: 11: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2269-179 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2269-179 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2269-179 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
AAAATAAAGA AATTCCTAGA GACATACAGC CTATCAAGAT CAAACCA 47 



(2) INFORMATION FOR SEQ ID NO: 12: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2271-403 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2271-403 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2271-403 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



AGGCATTTAT TTCATATTTA TTAACCTTGA TTTTCTTATC TTCAAGT 



(2) INFORMATION FOR SEQ ID NO: 13: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2272-409 

(B) LOCATION: 1..47 



(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2272-409 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2272-409 

(B) LOCATION: complement 25..47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 



AAAAGCACTG CAATTATTTT GGAGACTGTG AAATATTGCA AGTTTTA 



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2273-528 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2273-528 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2273-528 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:



ACTTGAAGAT AAGAAAATCA AGGCTAATAA ATATGAAATA AATGCCT 



(2) INFORMATION FOR SEQ ID NO: 15: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2275-466

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 



(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2275-466 

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2275-466
(B) LOCATION: complement 25..47
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TTGATGATAG CATTAAATAC TCCCAAAAAC TGTGAATAGG GATACTA 47



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2278-276 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2278-276 

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2278-276 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 



GAAAAAAATG GGAACATCTT CACAGCCTGT GCATCTCCAA CAAGATT 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2312-358 

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2312-358 

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2312-358

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 



TTGAAGAGAG AGATGGAAAA AAACGTAGGC CTTCTGGGTA AATGGCC 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2315-213

(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2315-213 

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2315-213 

(B) LOCATION: complement 25..47



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
AGATGGATTC TACCCACAGG CAAAAGAAAA CCTTATTTTA AAAATAA 47



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2320-292 

(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2320-292

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2320-292

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
ACTCTCATTC ACTAAACTTC AACCGTTTTT ATAAATTTAA TGAATTT 47



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:
(A) NAME/KEY: polymorphic fragment 99-2321-82

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2321-82 

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2321-82 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 



TAAAGCTTAC TGAGTGTCCA CTCCGGATAC CTACTCAAAT ATTTCCT 



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2324-338 

(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2324-338 

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2324-338 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 



AGATAGAAGA CAAAATCGCA GGAAAAGAAA TCCCTCAACA GTAAAAA 47



(2) INFORMATION FOR SEQ ID NO: 22:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2333-423 

(B) LOCATION: 1..47 



(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2333-423

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2333-423

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:



GAGACGCTAT CTATGCAAGG AGGGTGTTCA ACATTTGGAC AGCCACG 



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2341-485 

(B) LOCATION: 1..47
(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2341-485 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2341-485

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
ACACATCTGT CTGTTACCTA CACCTTACAA AGAATCGCAC AGGCTCT 47

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2342-217
(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2342-217
(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2342-217 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 



TAGAGCCTTG GACTTTCATG ACACTTCTAG AAACAGCCCA GATTGTG 47



(2) INFORMATION FOR SEQ ID NO: 25:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2362-270

(B) LOCATION: 1..47



(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 



(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2362-270 

(B) LOCATION: 1..23 

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2362-270

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:



TCTCTCTTGG GTGGTTCCTC AACATGTGTG ACCTTGACCA AGTATTG



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2364-329 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base
(B) LOCATION: 24
(D) OTHER INFORMATION: base g
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2364-329
(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2364-329

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
ATATAAAATG ATGAACCATA TACGTGAGGC AAGGTAACAT ATAATTG 47

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2367-61 
(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2367-61 

(B) LOCATION: 1..23 

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2367-61

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
TAAACATTTC ATTATTTCAG AAAATAATAT GCATTTTCAC CAACACA 47



(2) INFORMATION FOR SEQ ID NO: 28: 
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2371-93 
(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2371-93

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2371-93 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 



CTCTAAACTT TCCTAATACT TACATCACTG CCTACTTTTT ACATAAT 



(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2378-200 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE:
(A) NAME/KEY: Potential microsequencing oligo 99-2378-200

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2378-200 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:



GAGAACTTCC TGTTGAACCT GTTATAGAAC TGTCCTGTCG TCCAAGA 



(2) INFORMATION FOR SEQ ID NO: 30: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2381-394 

(B) LOCATION: 1..47 



(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2381-394 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2381-394 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 



AGTGGTCTTC AGGTTATTGG TAGAGAAAAG TAGGGGAGCT AAAGGTG 



(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2413-368

(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2413-368 

(B) LOCATION: 1..23 

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2413-368

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:



ATTTTAAGAG GAAAACTTAA TGGAAGAATT GTACATAATA TTTCATT 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2419-285 

(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2419-285 

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2419-285

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
AAGGGATCAA GCAGTGCCCA CTCCCCACCC TCCAGGGAGC TGTGACT 47



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2559-253 
(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base g

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2559-253 
(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2559-253 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
CAGGTGTTTT CATGCCCTCT TAGGGTGTGT CACATCATCC ATCTCAA 47 



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2566-112

(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2566-112
(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2566-112
(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 



GCCTTCACAA CCGCAGAGGC AAGAGAAGGA GCTTGGCCAC CCTGACT 



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2567-329 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2567-329 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2567-329 

(B) LOCATION: complement 25..47
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
CACTGTCAGA TATGAAATGA TGCGTGGCTT TCTTTGGGCT ATATTTG 47



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2570-218

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base c

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2570-218

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2570-218 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
GGAAAGTTCC AAATTATGAG AAGCGAGGCC TCTGAAGTGG CTAAGTT 47



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2571-242 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2571-242

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2571-242 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 



ATAATGAATG AGTATTTGAT ATTATATAAT TAAATGTGTC AGCATTT 



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2610-121 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2610-121 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2610-121 

(B) LOCATION: complement 25..47



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 



WO 99/04038




PCT/IB98/01193 



ATACCCCTTC CCTAGGTATG GCTATATGCT GCACTTAGAA AATTCTC 



(2) INFORMATION FOR SEQ ID NO: 39: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR



(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2615-83 

(B) LOCATION: 1..47 



(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2615-83 

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2615-83 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 



AACAAATCAC AAGTTGGCAA AAGCAGCAAA TTCTCATCTT CTGGGAA 



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2620-227
(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2620-227 

(B) LOCATION: 1..23 

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2620-227

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 



TTGACTGGGC TCCTGATGTG TCCAGGGTAT CTTGCTGGCT GTTTTGC 



(2) INFORMATION FOR SEQ ID NO: 41: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2624-407 

(B) LOCATION: 1..44



(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2624-407 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2624-407 

(B) LOCATION: complement 25..44

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 



ATCTGGCCAT AGGCAGAACA TTGGGGGAGA GATGGGGAAA GAGA 44



(2) INFORMATION FOR SEQ ID NO: 42:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2625-70 

(B) LOCATION: 1..47



(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 



(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2625-70 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2625-70
(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
AGTGACTCAA CCAGAAAGAG AGCAGGAGAG AGGACGAAGA GAGGAGA 47



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2630-67 

(B) LOCATION: 1..47 



(ix) FEATURE:
(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2630-67 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2630-67 
(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:



TAAATTCTGC CTAGAAGATT AAGATTGGTC CAGAACAGGG AGTGTTT 



(2) INFORMATION FOR SEQ ID NO: 44:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2633-129

(B) LOCATION: 1..47 



(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 



(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2633-129 

(B) LOCATION: 1..23



(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2633-129 

(B) LOCATION: complement 25..47



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 



TAGCTATTTC TTCCCCTAGG CAAAGTAGAC AATGAGAGAA CCCTTGA 



(2) INFORMATION FOR SEQ ID NO: 45:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2634-341

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2634-341

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2634-341

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 
GGAATCAATA TTTATTTATT ATCAACAGGT GAGACATTAT TTATTTA 47



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2637-28 

(B) LOCATION: 1..47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2637-28

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2637-28 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
CCATCACTTC CTCCTAGTGA AAAATCAAAG GAGGGTGGGT TTTATAG 47

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2642-255 

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2642-255 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2642-255 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
TGAGGGTGTT TCCAGAAGAG ACTAGCATTT GAATCTGAAG TGAGTAA 47

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2645-118

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2645-118
(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2645-118
(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:
CACAAATTAA TTGCATTGTT ATAGGCTAGC AATGAAGAAT CTGAAAA 47



(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2647-368 

(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a 



(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2647-368
(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2647-368

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
TTAAGGCCTT CAACTGATTA GACAAGGCCC ACTCACATTA TCTGACA 47



(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2649-107
(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base a

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2649-107 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2649-107

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:
CACAACTCTG GAGCCTTTTA TGAACAGGAC AGCAATGCAC TGAAACT 47



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2103-270 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID1

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base c; g in SEQ ID1
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2103-270 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2103-270 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:



CTTGGATTCA TATGAGACAG CTACCAGACC TTCAATTTTT CTACACT 47



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2228-301 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID2 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID2 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2228-301

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2228-301
(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:



CCCTGCTTAT CCCTGTAAGG TGGGGACCCA TATGGGCAAG GCCAGAC 47



(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2229-240 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID3 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c; g in SEQ ID3
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2229-240

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2229-240

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:



TCGTCATCGT GGCCTGGGCT ACATACTACC TGTTCCAGTC CTTCCAG 



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2240-281
(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID4

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base t; c in SEQ ID4
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2240-281

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2240-281 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 



GCAATCTTAA TAACTTTTTA TTTTAGTAAT TCGAATCTTT TTTTTCT 47



(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2242-206 
(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID5 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID5 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2242-206 

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2242-206

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 



CTGTTTTCTT TTAGTCAAAT TATTTTATAT TTTACTTTTT TCTTAAG 47



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2244-83

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID6 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID6
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2244-83 

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2244-83

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 



TAATTGTAGA TACTAAGACC ATTGTGCTTA AACCATGTAG GTACTGA 47



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2246-340

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID7 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID7 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2246-340 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2246-340 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 



ATTTATATGT TAAATGCAGA GAAGAAGAAA AATAAGTTTT GCAGTAA 



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2248-76

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID8

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID8
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2248-76 

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2248-76 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 



GACAGAGAGG GAAGGTAATC TTCTCCTGAA GTCTGCCCAT CCCCTGG 47



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2250-236 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID9 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID9
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2250-236 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2250-236 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 



ATGTATCCAA AACAGAATTA ACATACTTTG GGTTTTTTAT TTTTATT 



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2251-151

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID10

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base g; a in SEQ ID10
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2251-151 

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2251-151 
(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 
TGAAAAGPJkQ TTCAGACGAT TGCGGATAGA CTAGTTTGGC TGTTGTG 47



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2269-179 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID11

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID11
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2269-179 

(B) LOCATION: 1..23 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2269-179 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 



AAAATAAAGA AATTCCTAGA GACGTACAGC CTATCAAGAT CAAACCA 47 



(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2271-403

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID12

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID12
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2271-403 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2271-403 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 



AGGCATTTAT TTCATATTTA TTAGCCTTGA TTTTCTTATC TTCAAGT 



(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



WO 99/04038    PCT/IB98/01193



(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2272-409

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID13 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 
(B) LOCATION: 24

(D) OTHER INFORMATION: base t; g in SEQ ID13
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2272-409

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2272-409
(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 



AAAAGCACTG CAATTATTTT GGATACTGTG AAATATTGCA AGTTTTA 47



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2273-528 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID14 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID14 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2273-528 

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2273-528

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 



ACTTGAAGAT AAGAAAATCA AGGTTAATAA ATATGAAATA AATGCCT 47



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2275-466

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID15 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID15
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2275-466 

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2275-466 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 



TTGATGATAG CATTAAATAC TCCTAAAAAC TGTGAATAGG GATACTA 47



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2278-276

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID16 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID16 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2278-276 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2278-276

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 



GAAAAAAATG GGAACATCTT CACGGCCTGT GCATCTCCAA CAAGATT 47



(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2312-358 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID17 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24

(D) OTHER INFORMATION: base t; c in SEQ ID17 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2312-358 

(B) LOCATION: 1..23 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2312-358

(B) LOCATION: complement 25.. 47 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 



TTGAAGAGAG AGATGGAAAA AAATGTAGGC CTTCTGGGTA AATGGCC
(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2315-213 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID18 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID18
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2315-213 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2315-213 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 



AGATGGATTC TACCCACAGG CAAGAGAAAA CCTTATTTTA AAAATAA 47



(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2320-292 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID19 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID19 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2320-292 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2320-292 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 



ACTCTCATTC ACTAAACTTC AACTGTTTTT ATAAATTTAA TGAATTT 



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2321-82 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID20 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID20 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2321-82 

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2321-82

(B) LOCATION: complement 25..47



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 



TAAAGCTTAC TGAGTGTCCA CTCTGGATAC CTACTCAAAT ATTTCCT 



(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2324-338 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID21 

(ix) FEATURE:

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c; a in SEQ ID21
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2324-338

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2324-338

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 



AGATAGAAGA CAAAATCGCA GGACAAGAAA TCCCTCAACA GTAAAAA 47



(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2333-423

(B) LOCATION: 

(D) OTHER INFORMATION: variant version of SEQ ID22 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; g in SEQ ID22 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2333-423

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2333-423

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 



GAGACGCTAT CTATGCAAGG AGGTTGTTCA ACATTTGGAC AGCCACG 



(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2341-485 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID23 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID23 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2341-485 

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2341-485 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 



ACACATCTGT CTGTTACCTA CACTTTACAA AGAATCGCAC AGGCTCT 



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2342-217 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID24 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID24
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2342-217

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2342-217 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 



TAGAGCCTTG GACTTTCATG ACATTTCTAG AAACAGCCCA GATTGTG 



(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2362-270 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID25

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID25 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2362-270 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2362-270 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 



TCTCTCTTGG GTGGTTCCTC AACGTGTGTG ACCTTGACCA AGTATTG 47



(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2364-329 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID26 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c; g in SEQ ID26 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2364-329 

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2364-329

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 



ATATAAAATG ATGAACCATA TACCTGAGGC AAGGTAACAT ATAATTG 47



(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2367-61 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID27 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID27
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2367-61

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2367-61

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 



TAAACATTTC ATTATTTCAG AAAGTAATAT GCATTTTCAC CAACACA 47



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2371-93 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID28

(ix) FEATURE:

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c; a in SEQ ID28 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2371-93

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2371-93 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 
CTCTAAACTT TCCTAATACT TACCTCACTG CCTACTTTTT ACATAAT 47



(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2378-200 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID29 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID29 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2378-200 

(B) LOCATION: 1..23 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2378-200 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 



GAGAACTTCC TGTTGAACCT GTTGTAGAAC TGTCCTGTCG TCCAAGA 47



(2) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2381-394

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID30

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID30
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2381-394 

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2381-394 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 



AGTGGTCTTC AGGTTATTGG TAGGGAAAAG TAGGGGAGCT AAAGGTG 47 



(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2413-368 
(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID31 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID31 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2413-368

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2413-368 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 
ATTTTAAGAG GAAAACTTAA TGGGAGAATT GTACATAATA TTTCATT 



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2419-285 
(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID32 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID32 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2419-285 

(B) LOCATION: 1..23 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2419-285 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:



AAGGGATCAA GCAGTGCCCA CTCTCCACCC TCCAGGGAGC TGTGACT



(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2559-253

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID33

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base t; g in SEQ ID33
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2559-253

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2559-253

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:



CAGGTGTTTT CATGCCCTCT TAGTGTGTGT CACATCATCC ATCTCAA



(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2566-112 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID34

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24

(D) OTHER INFORMATION: base g; a in SEQ ID34 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2566-112 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2566-112 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 



GCCTTCACAA CCGCAGAGGC AAGGGAAGGA GCTTGGCCAC CCTGACT 47



(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2567-329 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID35 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; g in SEQ ID35 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2567-329 

(B) LOCATION: 1..23 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2567-329

(B) LOCATION: complement 25.. 47 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:



CACTGTCAGA TATGAAATGA TGCTTGGCTT TCTTTGGGCT ATATTTG 



(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2570-218 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID36 

(ix) FEATURE:

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID36
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2570-218 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2570-218 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 



GGAAAGTTCC AAATTATGAG AAGTGAGGCC TCTGAAGTGG CTAAGTT 47



(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2571-242 
(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID37 

(ix) FEATURE:

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID37 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2571-242 

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2571-242 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 



ATAATGAATG AGTATTTGAT ATTGTATAAT TAAATGTGTC AGCATTT 47



(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2610-121

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID38 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c; a in SEQ ID38 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2610-121 

(B) LOCATION: 1..23 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2610-121

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 



ATACCCCTTC CCTAGGTATG GCTCTATGCT GCACTTAGAA AATTCTC 



(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2615-83 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID39 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24

(D) OTHER INFORMATION: base t; c in SEQ ID39
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2615-83 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2615-83 

(B) LOCATION: complement 25.. 47 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 



AACAAATCAC AAGTTGGCAA AAGTAGCAAA TTCTCATCTT CTGGGAA 



(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2620-227 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID40

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24

(D) OTHER INFORMATION: base g; a in SEQ ID40 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2620-227 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2620-227

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 
TTGACTGGGC TCCTGATGTG TCCGGGGTAT CTTGCTGGCT GTTTTGC 



(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 44 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2624-407 

(B) LOCATION: 1..44

(D) OTHER INFORMATION: variant version of SEQ ID41 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; g in SEQ ID41 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2624-407 

(B) LOCATION: 1..23
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2624-407 

(B) LOCATION: complement 25.. 44 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 



ATCTGGCCAT AGGCAGAACA TTGTGGGAGA GATGGGGAAA GAGA 44



(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2625-70 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID42 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID42
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2625-70 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2625-70 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 



AGTGACTCAA CCAGAAAGAG AGCGGGAGAG AGGACGAAGA GAGGAGA 47



(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2630-67 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID43 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24

(D) OTHER INFORMATION: base g; a in SEQ ID43
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2630-67 

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2630-67 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 



TAAATTCTGC CTAGAAGATT AAGGTTGGTC CAGAACAGGG AGTGTTT 47



(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2633-129 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID44 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base c; a in SEQ ID44 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2633-129 

(B) LOCATION: 1..23 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2633-129 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:



TAGCTATTTC TTCCCCTAGG CAACGTAGAC AATGAGAGAA CCCTTGA 47



(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2634-341

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID45

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID45
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2634-341 

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2634-341 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 



GGAATCAATA TTTATTTATT ATCGACAGGT GAGACATTAT TTATTTA 47 



(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2637-28

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID46

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID46 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2637-28

(B) LOCATION: 1..23 

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2637-28

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:



CCATCACTTC CTCCTAGTGA AAAGTCAAAG GAGGGTGGGT TTTATAG 47



(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2642-255 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID47 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID47 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2642-255 

(B) LOCATION: 1..23 
(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2642-255
(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 



TGAGGGTGTT TCCAGAAGAG ACTGGCATTT GAATCTGAAG TGAGTAA 



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2645-118 

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID48

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; g in SEQ ID48
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2645-118 

(B) LOCATION: 1..23

(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2645-118 

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 



CACAAATTAA TTGCATTGTT ATATGCTAGC AATGAAGAAT CTGAAAA 



(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 
(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2647-368

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID49

(ix) FEATURE:

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g; a in SEQ ID49
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2647-368

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2647-368

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:



TTAAGGCCTT CAACTGATTA GACGAGGCCC ACTCACATTA TCTGACA 47



(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2649-107

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID50 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24

(D) OTHER INFORMATION: base t; a in SEQ ID50 
(ix) FEATURE: 

(A) NAME/KEY: Potential microsequencing oligo 99-2649-107

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2649-107

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 



CACAACTCTG GAGCCTTTTA TGATCAGGAC AGCAATGCAC TGAAACT 47



(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID1, SEQ ID51

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 



CCTGGATTCT GACCCATC 18 



(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID2, SEQ ID52 

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 



TCTACCTCTA CCTCTTTC 18



(2) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID3, SEQ ID53

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 



CTTCCCATAC CTCTGATAC 19 



(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID4, SEQ ID54 
(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 



TTCAACAGTG AAGCCATC 18 



(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID5, SEQ ID55
(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:
TGATGTGTGT GACTCAGG 18



(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID6, SEQ ID56

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 



ATAGAGGAAC CAAACCTG 



(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID7, SEQ ID57 
(B) LOCATION: 1..18
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

AGCAGCATGG AAGCAAAC 18 

(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID8, SEQ ID58 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 
CTGATGAAAG TGGCTCTC 18 



(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID9, SEQ ID59

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:



TGTATCTGAG GTCTAAAAC 19



(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID10, SEQ ID60

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 



TATATGTAGA GGGTGAGG 



(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID11, SEQ ID61

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 



AGGCTAAGAA AAAAAGAGG 



(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID12, SEQ ID62

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:
TGAAAAGACT AAGTTCTGG 19

(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID13, SEQ ID63
(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:
ATGCTAGAGG AAAGGAAC 18



(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID14, SEQ ID64 

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

ATACCAGGGA CTTTAGTG 18



(2) INFORMATION FOR SEQ ID NO: 115:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID15, SEQ ID65

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 



AGATTCAGAC CAATTTCAC 19 



(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID16, SEQ ID66 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: 



TGCTTTGATT TGACCCTG 18 



(2) INFORMATION FOR SEQ ID NO: 117:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID17, SEQ ID67

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 



GCCTATCTTG TTTTGACTG 



(2) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 


(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID18, SEQ ID68

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 



TTCAGAGCAA CAATTTTGG 



(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 
(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID19, SEQ ID69

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119: 



CCAAGTTTAT GAGATTAGAG 



(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID20, SEQ ID70

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 



CTA.^CCTAGA TGATCTTCC 



(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID21, SEQ ID71

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 



TGTCCCAAGT TTAGTTCC 18



(2) INFORMATION FOR SEQ ID NO: 122:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID22, SEQ ID72

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 



CCAGGAATAA TACTTTGCAT C 



(2) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID23, SEQ ID73 

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 



CTCAGTTTTT CTTTCCACC 



(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID24, SEQ ID74

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:
GACTCAGGCA CAACTTTTAG 20



(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID25, SEQ ID75

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125: 



TACAGCAATG GTATAAAGC 



(2) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID26, SEQ ID76 
(B) LOCATION: 1..20
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126: 

TTATCCATCA TTTAGAAGGC 20 

(2) INFORMATION FOR SEQ ID NO: 127:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID27, SEQ ID77 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:
CACTGGAGAT AGCTGAAC 18

(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID28, SEQ ID78 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 



GTACTGTCAA ATCATCACC 19



(2) INFORMATION FOR SEQ ID NO: 129:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID29, SEQ ID79
(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 



CGGGCATAAA AATGCAGG 18 



(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID30, SEQ ID80

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 



GTATATGTGA AGGTTGTGGG 20 



(2) INFORMATION FOR SEQ ID NO: 131: 
(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID31, SEQ ID81

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:



GTAAGATCTG ACTTGCTCC 



(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID32, SEQ ID82

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: 



CCAGCTTGAA TTTTGGTGAG 



(2) INFORMATION FOR SEQ ID NO: 133: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID33, SEQ ID83 

(B) LOCATION: 1..18



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 



WO 99/04038    PCT/IB98/01193



GCATATCTTG GTGGTCTG 18



(2) INFORMATION FOR SEQ ID NO: 134:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID34, SEQ ID84

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 



AGGGTTCAAA GGAAGGAGG 19 



(2) INFORMATION FOR SEQ ID NO: 135:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID35, SEQ ID85

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 



GAAAAAGAAG GGAAAGAAAG 20 



(2) INFORMATION FOR SEQ ID NO: 136: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID36, SEQ ID86

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 



GTTTGTCTTG GCTATTAAG 19 



(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID37, SEQ ID87

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 



TGAAAAAGTG GGTAGCAG 18 



(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 
(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID38, SEQ ID88
(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:



ATATCAGGGC AGGCACAAG 19



(2) INFORMATION FOR SEQ ID NO: 139:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID39, SEQ ID89

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:



GGAAGAGGGC AACTTTAC 18



(2) INFORMATION FOR SEQ ID NO: 140:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID40, SEQ ID90

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:



TGAAATGGGC TGTAGATG 18



(2) INFORMATION FOR SEQ ID NO: 141:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID41, SEQ ID91

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:



TTAAACCTTG GCTTCCTG 



(2) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID42, SEQ ID92 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142: 



TTCAACCTTT TGTCGCTG 



(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID43, SEQ ID93

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:
ATGTAACAGA TGTCCAAAG 19



(2) INFORMATION FOR SEQ ID NO: 144:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID44, SEQ ID94 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: 
CTAAGGGTCT TCTTTCTG 18



(2) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID45, SEQ ID95

(B) LOCATION: 1..19
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 

GGTGTATTTA GGTTTGTGG 



(2) INFORMATION FOR SEQ ID NO: 146: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID46, SEQ ID96

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:
CTACCATCAC TTTCCTCC 18



(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID47, SEQ ID97 

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 



ATAACTAGGC ATCCAGAC 18



(2) INFORMATION FOR SEQ ID NO: 148:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID48, SEQ ID98

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148: 
CGACATAATT TGGTATGTAG 20

(2) INFORMATION FOR SEQ ID NO: 149: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID49, SEQ ID99 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149: 
TCACCAAGTG TCATCGTC 18 

(2) INFORMATION FOR SEQ ID NO: 150: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID50, SEQ ID100

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150: 

GAGACTTTGT AACTTTGTG 19 



(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID1, SEQ ID51

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 

GTCTTCATAA GTCTTCAGTG 20 



(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID2, SEQ ID52

(B) LOCATION: 1..18
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152: 

CAAAACACTC CCTCACAC 18 

(2) INFORMATION FOR SEQ ID NO: 153:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID3, SEQ ID53

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 



CAGGTGATGT CTGGATAC 



(2) INFORMATION FOR SEQ ID NO: 154: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID4, SEQ ID54

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 



AAGACAACAA GAACTAAATC C 21



(2) INFORMATION FOR SEQ ID NO: 155:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID5, SEQ ID55

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155: 

TCCCCAATAG ATTAAAGTTC 20

(2) INFORMATION FOR SEQ ID NO: 156: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID6, SEQ ID56

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 

CTGAGCATCA AATAGGAG 18 
(2) INFORMATION FOR SEQ ID NO: 157: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID7, SEQ ID57

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:



TCATTACAGA AAAAGCCAAA G 21 



(2) INFORMATION FOR SEQ ID NO: 158: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID8, SEQ ID58

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158: 

TCCTTCTCCA CCTAAAATTC 20 



(2) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID9, SEQ ID59

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159:

ACTGCTTCTG CTCTCTTG 18



(2) INFORMATION FOR SEQ ID NO: 160: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID10, SEQ ID60

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160:

TGAACATACA P^J-J^JKChCTGG

(2) INFORMATION FOR SEQ ID NO: 161:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID11, SEQ ID61

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161:

AGAGTTGTTG GCATGTAG 18



(2) INFORMATION FOR SEQ ID NO: 162: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID12, SEQ ID62

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162: 

AACTGCTCAG CAACTGTG 18



(2) INFORMATION FOR SEQ ID NO: 163:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID13, SEQ ID63

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:

TTAGAACACT TTTATGGGAA C 21 



(2) INFORMATION FOR SEQ ID NO: 164:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID14, SEQ ID64

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:

GTCCTAGAAT GAGCAAATG

(2) INFORMATION FOR SEQ ID NO: 165:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID15, SEQ ID65

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165:

AGAGAAAGAA CCAGAGCC

(2) INFORMATION FOR SEQ ID NO: 166:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID16, SEQ ID66

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166: 

TGGAGTCTAA ACTAGGTG 18



(2) INFORMATION FOR SEQ ID NO: 167: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID17, SEQ ID67

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167: 

GGACCTTTTA AGAGTGTG 18 



(2) INFORMATION FOR SEQ ID NO: 168: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID18, SEQ ID68

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:

TGGTTTCTTC AAACAAGAG 



(2) INFORMATION FOR SEQ ID NO: 169:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID19, SEQ ID69

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:

AAGTTGGATA ACCTTCTTTT G

(2) INFORMATION FOR SEQ ID NO: 170:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID20, SEQ ID70

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:

TAGTTTCGTG AACTTATCC



(2) INFORMATION FOR SEQ ID NO: 171:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID21, SEQ ID71

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:

GTTTACATTA TGCCCCTTTT C 21 



(2) INFORMATION FOR SEQ ID NO: 172: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID22, SEQ ID72

(B) LOCATION: 1..18 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172: 



CTCCACTGCC ACAACTTC 



(2) INFORMATION FOR SEQ ID NO: 173:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID23, SEQ ID73

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:

TGCTCTGCTT GTAATGTTAT G

(2) INFORMATION FOR SEQ ID NO: 174:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID24, SEQ ID74

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:

CAAGGTTGCC AGTCACATC

(2) INFORMATION FOR SEQ ID NO: 175:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID25, SEQ ID75

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:

ATGAAGATAC GCAGCCAG 18



(2) INFORMATION FOR SEQ ID NO: 176: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID26, SEQ ID76

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:



CTCATTTAAC TCCCATTCCT C 



(2) INFORMATION FOR SEQ ID NO: 177:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID27, SEQ ID77

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:

TGCTTTTCTT GTCCCTGATT G 21



(2) INFORMATION FOR SEQ ID NO: 178:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID28, SEQ ID78

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:

GCATTGAATC CGTAAATTTC

(2) INFORMATION FOR SEQ ID NO: 179:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID29, SEQ ID79

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:

CAGTTTTGGT CATTGTGGGA G 21

(2) INFORMATION FOR SEQ ID NO: 180:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID30, SEQ ID80

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:

AAATCCAACT ATGTCACTTC C

(2) INFORMATION FOR SEQ ID NO: 181:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID31, SEQ ID81

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:

AATGTCCCCT CCTCCTCTG

(2) INFORMATION FOR SEQ ID NO: 182:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID32, SEQ ID82

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182: 



GCCACAAGTA TTTGGGTGCC 



(2) INFORMATION FOR SEQ ID NO: 183: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID33, SEQ ID83

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:

CCTACGGTTT GTCATAAAG

(2) INFORMATION FOR SEQ ID NO: 184:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID34, SEQ ID84

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:

TGTAACAGGG GACATGGGAA G 21



(2) INFORMATION FOR SEQ ID NO: 185: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID35, SEQ ID85

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185:

CAATTTTGTA TGGATGACAG 20

(2) INFORMATION FOR SEQ ID NO: 186:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID36, SEQ ID86

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186:

TGGTGGTGGA AAAAAAGAAG G



(2) INFORMATION FOR SEQ ID NO: 187: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID37, SEQ ID87

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187:

CTATA^^CTCT TATCAGTGAA C

(2) INFORMATION FOR SEQ ID NO: 188:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID38, SEQ ID88

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188:

AGGTCACTCA AGTATTATGG 20

(2) INFORMATION FOR SEQ ID NO: 189:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID39, SEQ ID89

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189:

CCCCAGCTCC CAAATAATGA C

(2) INFORMATION FOR SEQ ID NO: 190:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID40, SEQ ID90

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190:

TCCACAACAG ACACTTAAAC

(2) INFORMATION FOR SEQ ID NO: 191:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID41, SEQ ID91

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191:

TCTCTTTCCC CATCTCTC 18



(2) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID42, SEQ ID92

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192:

TCCCCTTCTA TTGTCTACC



(2) INFORMATION FOR SEQ ID NO: 193: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID43, SEQ ID93

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193:

GGTTTGTGTT CAGTACGG 



(2) INFORMATION FOR SEQ ID NO: 194: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID44, SEQ ID94

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194:

TGTATATGCC TGGTGGAAAT G 21

(2) INFORMATION FOR SEQ ID NO: 195: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID45, SEQ ID95

(B) LOCATION: 1..21 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195: 



GTGAAAGAAA CTTGATAGAG G 21

(2) INFORMATION FOR SEQ ID NO: 196:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID46, SEQ ID96

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196:

CCTCCAACAG TAAGAATC 18



(2) INFORMATION FOR SEQ ID NO: 197: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID47, SEQ ID97

(B) LOCATION: 1..20

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197:

CAGAACCATT AACTATTCAC 20



(2) INFORMATION FOR SEQ ID NO: 198: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: downstream amplification primer for SEQ ID48, SEQ ID98

(B) LOCATION: 1..20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198: 

GCCATTTGGA ATTTTGATAG 20 



(2) INFORMATION FOR SEQ ID NO: 199:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID49, SEQ ID99

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199: 



TGCAGCATCC CTGGAAGTC 



(2) INFORMATION FOR SEQ ID NO: 200: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID50, SEQ ID100

(B) LOCATION: 1..21

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200: 



GAGACATCAT ATCTGTGTTT G 



(2) INFORMATION FOR SEQ ID NO: 201: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2103-270.mis1

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201: 



GATTCATATG AGACAGCTA 



(2) INFORMATION FOR SEQ ID NO: 202:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2228-301.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202: 



wo 99/04038 



111 



PCT/IB98/01193 



CCCTGCTTAT CCCTGTAAGG TGG 



(2) INFORMATION FOR SEQ ID NO: 203: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2229-240.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203:



TCGTCATCGT GGCCTGGGCT ACA 



(2) INFORMATION FOR SEQ ID NO: 204: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2240-281.mis1

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204: 



TCTTAATAAC TTTTTATTT 



(2) INFORMATION FOR SEQ ID NO: 205: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2242-206.mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205: 



TTTCTTTTAG TCAAATTAT 



(2) INFORMATION FOR SEQ ID NO: 206: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2244-83.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206:



TAATTGTAGA TACTAAGACC ATT 



(2) INFORMATION FOR SEQ ID NO: 207:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2246-340.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207:



(2) INFORMATION FOR SEQ ID NO: 208: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2248-76.mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208:

GAGAGGGAAG GTAATCTTC



(2) INFORMATION FOR SEQ ID NO: 209: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2250-236.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209: 



ATTTATATGT TAAATGCAGA GAA 



23 



TTTTATCCAA AACAGAATTA ACA 



23

(2) INFORMATION FOR SEQ ID NO: 210:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2251-151.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210: 



TGAAAAGAAG TTCAGACGAT TGC 23 



(2) INFORMATION FOR SEQ ID NO: 211:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2269-179.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211: 



AAAATAAAGA AATTCCTAGA GAC 23 



(2) INFORMATION FOR SEQ ID NO: 212: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2271-^03.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212:

AGGCATTTAT TTCATATTTA TTA 23



(2) INFORMATION FOR SEQ ID NO: 213: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2272-^09.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213:

AAAAGCACTG CAATTATTTT GGA 23



(2) INFORMATION FOR SEQ ID NO: 214: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2273-528 .misl 

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214:

GAAGATAAGA AAATCAAGG 19



(2) INFORMATION FOR SEQ ID NO: 215: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2275-466.mis1

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 215: 
TGATAGCATT AAATACTCC 



(2) INFORMATION FOR SEQ ID NO: 216: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2278-276.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216: 
GAAAAAAATG GGAACATCTT CAC 23 



(2) INFORMATION FOR SEQ ID NO: 217:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2312-358.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217: 



TTTTAGAGAG AGATGGAAAA AAA 23 



(2) INFORMATION FOR SEQ ID NO: 218: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2315-213.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 218: 



AGATGGATTC TACCCACAGG CAA 23 



(2) INFORMATION FOR SEQ ID NO: 219: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2320-292.mis1

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219: 



TCATTCACTA AACTTCAAC 



(2) INFORMATION FOR SEQ ID NO: 220: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2321-82.mis1

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220: 



GCTTACTGAG TGTCCACTC 



(2) INFORMATION FOR SEQ ID NO: 221: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2324-338.mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221:

AGAAGACAAA ATCGCAGGA

(2) INFORMATION FOR SEQ ID NO: 222:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2333-423.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222:

GAGACGCTAT CTATGCAAGG AGG 23



(2) INFORMATION FOR SEQ ID NO: 223: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2341-485.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 223:

TTTTATCTGT CTGTTACCTA CAC



(2) INFORMATION FOR SEQ ID NO: 224: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2342-217.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224: 



TTTTGCCTTG GACTTTCATG ACA 



(2) INFORMATION FOR SEQ ID NO: 225: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2362-270.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 225: 



TCTCTCTTGG GTGGTTCCTC AAC 



(2) INFORMATION FOR SEQ ID NO: 226: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2364-329.mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 226:



aaaatc;atc;a ac:c:atatac 



(2) INFORMATION FOR SEQ ID NO: 227:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2367-61.mis1
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227:



TA.^v.ACATTTC ATTATTTCAG AAA 



(2) INFORMATION FOR SEQ ID NO: 228:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2371-93.mis1
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 228:



TTTTAAACTT TCCTAATACT TAG 23



(2) INFORMATION FOR SEQ ID NO: 229:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2378-200.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 229:



{;agaacttcc tgttgaacct gtt 



(2) INFORMATION FOR SEQ ID NO: 230:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2381-394.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 230: 



AGTGGTCTTC AGGTTATTGG TAG 



(2) INFORMATION FOR SEQ ID NO: 231: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo *)')-2'l M- 1 .mis1
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 231:



attttaac;ag c^aa^^acttaa tog 



(2) INFORMATION FOR SEQ ID NO: 232: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 0<')-2.; : \*-2H S .mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 232:



GATCAAGCAG TGCCCACTC 



(2) INFORMATION FOR SEQ ID NO: 233: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2559-253.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 233:



CAGGTGTTTT CATGCCCTCT TAG 23



(2) INFORMATION FOR SEQ ID NO: 234:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo ^.)9- 2 Ij b b- 1 L 2 .mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 234:
GCCTTCACAA CCGCAGAGGC AAG 23



(2) INFORMATION FOR SEQ ID NO: 235:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2567-329.mis1

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 235: 
CACTGTCAGA TATGAAATGA TGC 23 



(2) INFORMATION FOR SEQ ID NO: 236:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo ^)^i-;^S7()-:M iJ .mis1
(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 236:



AGTTCCAAAT TATGAGAAG 



(2) INFORMATION FOR SEQ ID NO: 237:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2571-242.mis1
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 237:



ATAATGAATG AGTATTTGAT ATT 



(2) INFORMATION FOR SEQ ID NO: 238: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo b 1 0- 1 2 1 .mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 238:
TTTTCCCTTC C:CTAGGTATG GCT 23

(2) INFORMATION FOR SEQ ID NO: 239:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2615-83.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 239:
TTTTAATCAC AAGTTGGCAA AAG 23

(2) INFORMATION FOR SEQ ID NO: 240:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2620-227.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 240:
TTGACTGGGC TCCTGATGTG TCC 23



(2) INFORMATION FOR SEQ ID NO: 241:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2624-407.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 241:



ATCTGCCCAT AGGCAGAACA TTG 23 



(2) INFORMATION FOR SEQ ID NO: 242:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2625-70.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 242:
AGTGACTCAA CCAGAAAGAG AGC 23



(2) INFORMATION FOR SEQ ID NO: 243: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 






(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo [)')-2u }0-i'>'I .mis1
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 243:
TA7VATTCTGC CTAGAAGATT AAG 23

(2) INFORMATION FOR SEQ ID NO: 244:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2633-129.mis1
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 244:
TTTTTATTTC TTCCCCTAGG CAA 23

(2) INFORMATION FOR SEQ ID NO: 245: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2634-341.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 245:



CC^AATCAATA TTTATTTATT ATC 



(2) INFORMATION FOR SEQ ID NO: 246:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2637-28.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 246:



CC'ATCACTTC CTCCTAGTGA AAA 



(2) INFORMATION FOR SEQ ID NO: 247:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2642-255.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 247: 



TGAGGGTGTT TCCAGAAGAG ACT 23






(2) INFORMATION FOR SEQ ID NO: 248: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 1 0 .mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 248:



CACAAATTAA TTGCATTGTT ATA 23



(2) INFORMATION FOR SEQ ID NO: 249: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2647-368.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 249:



TTAAGGCCTT CAACTGATTA GAC 23



(2) INFORMATION FOR SEQ ID NO: 250: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo nq-;^,.; 107 .mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 250:



actctc;(*;a( ;c cttttatga 



(2) INFORMATION FOR SEQ ID NO: 251:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo ^9-2 1 0 "^-^r/O .mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 251:



AGTGTAGAAA A.ATTGAAGGT CTG 



(2) INFORMATION FOR SEQ ID NO: 252: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-222S-301 .mis2 

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 252:
GGCCTTGCCC ATATGGGTC 



(2) INFORMATION FOR SEQ ID NO: 253:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 9')-22J9-2'l 0 .mis2

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 253: 
AAGGACTGGA ACAGGTAGT 



(2) INFORMATION FOR SEQ ID NO: 254:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2240-281.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 254:
AGAAAAAAAA GATTCGAATT ACT 23



(2) INFORMATION FOR SEQ ID NO: 255:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo ')'^^-22']:i-y,0b .mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 255:



CTTAAGAAA.^ AAGTAAAATA TAA 



(2) INFORMATION FOR SEQ ID NO: 256:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 9')-22 m -8 3 .mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 256:



TACCTACATG GTTTAAGCA 



(2) INFORMATION FOR SEQ ID NO: 257: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-22'l 0-3^1 0 .mis2
(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 257:
T{;CAAAACTT ATTTTTCTT 19



(2) INFORMATION FOR SEQ ID NO: 258:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 9^)-22 '1 13-7 G .mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 258:
CCAGGGGATG GGCAGACTTC AGG 23



(2) INFORMATION FOR SEQ ID NO: 259:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2250-236 . mis2 

(B) LOCATION: 1..23 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 259:
AAT.WVAATA AAAAACCCAA AGT



(2) INFORMATION FOR SEQ ID NO: 260:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-?25 i - 1 !3 1 .mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 260:
TT7TACAGCC A.A.ACTAGTCT ATC 



(2) INFORMATION FOR SEQ ID NO: 261: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2269-179 .mis2 

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 261:
TTGATCTTGA TAGGCTGTA 



(2) INFORMATION FOR SEQ ID NO: 262:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo -22'I .mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 262:



GAAGATAACA AAATCAACG 19



(2) INFORMATION FOR SEQ ID NO: 263: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 09-2272-^ 09" .mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 263:



TTTTACTTGC AATATTTCAC AGT 



(2) INFORMATION FOR SEQ ID NO: 264: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 
(vi) ORIGINAL SOURCE: 



(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2273-528.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 264:



AC^iGCATTTAT TTCATATTTA TTA 



(2) INFORMATION FOR SEQ ID NO: 265:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-227 1^-/] GG .mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 265:



TAGTATCCCT ATTCACAGTT TTT 



(2) INFORMATION FOR SEQ ID NO: 266:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2278-276.mis2

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 266: 



TTGTTGGAGA TGCACAGGC 19



(2) INFORMATION FOR SEQ ID NO: 267:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 'J 9-231 2-3S^K .mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 267:



Gc;CCATTTAC CCAGAAGGCC TAC 



(2) INFORMATION FOR SEQ ID NO: 268:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2315-213.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 268: 



TTTTTTTTAA AATAAGGTTT TCT 



(2) INFORMATION FOR SEQ ID NO: 269: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo [)^)-2320-:y)2 .mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 269:



AAATTCIATTA AATTTATAAA AAC 



(2) INFORMATION FOR SEQ ID NO: 270:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 40-2 3^ 1-82 .mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 270:



AGGPJ^JVTATT TGAGTAGGTA TCC 



(2) INFORMATION FOR SEQ ID NO: 271:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2324-338.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 271:



TTTTTACTGT TGAGGGATTT CTT 



(2) INFORMATION FOR SEQ ID NO: 272:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-233 .mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 272: 



TTTTGCTGTC CArV^TGTTGA ACA 2 3 



(2) INFORMATION FOR SEQ ID NO: 273:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2341-485.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 273:



AGAGCCTGTG CGATTCTTTG TAA 23 



(2) INFORMATION FOR SEQ ID NO: 274:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo *)0-2i-l.^-:n 7 .mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 274:



CACAATCTGG GCTGTTTCTA GAA 



(2) INFORMATION FOR SEQ ID NO: 275: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2362-270.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 275: 



TTTTACTTGG TCAAGGTCAC ACA 



(2) INFORMATION FOR SEQ ID NO: 276:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-236-1 - 12^1 .mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 276:
c:aattatatg TTACCTTGCC TCA 

(2) INFORMATION FOR SEQ ID NO: 277:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2367-61.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 277:
TTTTTTGGTG A-^v,AATGCATA TTA 23

(2) INFORMATION FOR SEQ ID NO: 278: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-2371-93.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 278:
ATTATGTAAA AAGTAGGCAG TGA 23



(2) INFORMATION FOR SEQ ID NO: 279:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2378-200.mis2
(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 279:
GGACGACAGG ACAGTTCTA 



(2) INFORMATION FOR SEQ ID NO: 280: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2381-394.mis2

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 280: 
TTTAGCTCCC CTACTTTTC 



(2) INFORMATION FOR SEQ ID NO: 281:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo *)<)-:m M- HuS .mis2
(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 281:



AAATATTATG TACAATTCT 19



(2) INFORMATION FOR SEQ ID NO: 282:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo D9t-2/J 1 9-2^ 5 .mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 282:



AGTCACAGCT CCCTGGAGGG TGG 23 



(2) INFORMATION FOR SEQ ID NO: 283: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 



(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2559-253.mis2

(B) LOCATION: 1..19



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 283: 



c;a'ih;c;atgat ctcacacac 



(2) INFORMATION FOR SEQ ID NO: 284:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 9^-2S^^i- 1 1 2 .mis2
(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 284:



Ai ;t ;f ;tC:C;cca agctccttc 



(2) INFORMATION FOR SEQ ID NO: 285: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2567-329.mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 285: 



TATAGCCCAA AGAAAGCCA 19 



(2) INFORMATION FOR SEQ ID NO: 286:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo ')^l-2S'?0-2 1 8 .mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 286:



AACTTAGCCA CTTCAGAGGC CTC 



(2) INFORMATION FOR SEQ ID NO: 287: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 
(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2571-242.mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 287:



GCTGACACAT TTAATTATA 



(2) INFORMATION FOR SEQ ID NO: 288: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo \2l .mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 288:
CIAGAATTTTC TAAGTCCAGC ATA 23



(2) INFORMATION FOR SEQ ID NO: 289: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo ^)0-2 G -H 3 .mis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 289:

TTCCCAGAA.G ATGAGAATTT GOT 23



(2) INFORMATION FOR SEQ ID NO: 290: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2620-227.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 290:
TTTT.^ACAGC CAGCAAGATA CCC 



(2) INFORMATION FOR SEQ ID NO: 291:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2624-407.mis2

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 291: 
TTTTCTCTTT CCCCATCTCT CCC 



(2) INFORMATION FOR SEQ ID NO: 292:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2625-70 . mis2 

(B) LOCATION: 1..23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 292: 
TTTTCTCTCT TCKTCCTCTC TCC 



(2) INFORMATION FOR SEQ ID NO: 293:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo '>*.)-2 b m')-()7 .mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 293:



TTTTACTCCC TGTTCTGGAC CAA 23



(2) INFORMATION FOR SEQ ID NO: 294: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2633-129.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 294:



TCAAGGGTTC TCTCATTGTC TAC 23 



(2) INFORMATION FOR SEQ ID NO: 295: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo ^n-2v>3 1-3-11 .mip:-

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 295:



TTTTTAAATA ATGTCTCACC TGT 23



(2) INFORMATION FOR SEQ ID NO: 296:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2 C37 -2B . miijli

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 296:



TTTTA7WVCC CACCCTCCTT TGA 23



(2) INFORMATION FOR SEQ ID NO: 297: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2642-255.mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 297:

TCACTTCAGA TTCAAATGC



(2) INFORMATION FOR SEQ ID NO: 298:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 9^>-2 5- 11 8 . inis2
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 298:



TTTTCAGATT CTTCATTGCT AGC 



(2) INFORMATION FOR SEQ ID NO: 299: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-26.17-368.mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 299:



AGATAJ^TGTG AGTGGGCCT 



(2) INFORMATION FOR SEQ ID NO: 300:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo ')')-26^''i" [01 \
(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 300:
AGTTTCAGTG CATTGCTGTC CTG 23 



(2) INFORMATION FOR SEQ ID NO: 301:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-344

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base a

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-344-mis1
(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-344-mis2

(B) LOCATION: complement 25..43

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 301:
TGCTGCCAAG GATCCATGTC AGCATGCTCC TCTCTGAGCC CTGGTCT 47



(2) INFORMATION FOR SEQ ID NO: 302:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-366

(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base
(B) LOCATION: 24

(D) OTHER INFORMATION: base t

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-366-mis1
(B) LOCATION: 5..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-366-mis2

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 302:
AGGGCCTGGC TTCAGGGACA GCTTAGGAAA TGTTTGTTGA GTTAGTG 47



(2) INFORMATION FOR SEQ ID NO: 303:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-359 

(B) LOCATION: 1..47

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base g

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-359-mis1

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-359-mis2
(B) LOCATION: complement 25..43

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 303:
CTACAGAGTC ATCGCCTCCA TCCGGTCTCA ACAAATCCTG GCAGCTC 47

(2) INFORMATION FOR SEQ ID NO: 304:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-355
(B) LOCATION: 1..47

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base g

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-355-mis1
(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-355-mis2

(B) LOCATION: complement 25..43

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 304:
GGAGTTTCGG GGAGTTTCGG GAGGGTTCCT GGGAAGAAGC TCCTCCC 47

(2) INFORMATION FOR SEQ ID NO: 305: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs

(B) TYPE: NUCLEIC ACID
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-365

(B) LOCATION: 1..48

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base c

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-365-mis1

(B) LOCATION: 5..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-365-mis2
(B) LOCATION: complement 25..48

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 305:



CCTACCAAGC AAGCAGCCCC AGCCTAGGGT CAGACAGGGT GAGCCTC 47



(2) INFORMATION FOR SEQ ID NO: 306:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-2452

(B) LOCATION: 1..47

(D) OTHER INFORMATION: Extracted from sequence gb:M10065

(3909..3955)

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base c

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2452-mis1

(B) LOCATION: 5..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2452-mis2

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 306:



TCGGCGCGGA CATGGAGGAC GTGCGCGGCC GCCTGGTGCA GTACCGC 47



(2) INFORMATION FOR SEQ ID NO: 307:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-344
(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID301

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base g; a in SEQ ID301

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-344-mis1

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-344-mis2

(B) LOCATION: complement 25..43

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 307:



TGCTGCCAAG GATCCATGTC AGCGTGCTCC TCTCTGAGCC CTGGTCT 47



(2) INFORMATION FOR SEQ ID NO: 308: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-366

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID302

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base c; t in SEQ ID302

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-366-mis1

(B) LOCATION: 5..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-366-mis2

(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 308:



AGGGCCTGGC TTCAGGGACA GCTCAGGAAA TGTTTGTTGA GTTAGTG 47



(2) INFORMATION FOR SEQ ID NO: 309: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-359 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID303 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base a; g in SEQ ID303

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-359-mis1
(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-359-mis2
(B) LOCATION: complement 25..43

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 309:
CTACAGAGTC ATCGCCTCCA TCCAGTCTCA ACAAATCCTG GCAGCTC 47



(2) INFORMATION FOR SEQ ID NO: 310: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-355

(B) LOCATION: 1..47

(D) OTHER INFORMATION: variant version of SEQ ID304

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base a; g in SEQ ID304

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-355-mis1

(B) LOCATION: 1..23

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-355-mis2

(B) LOCATION: complement 25..43

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 310:
GGAGTTTCGG GGAGTTTCGG GAGAGTTCCT GGGAAGAAGC TCCTCCC 47



(2) INFORMATION FOR SEQ ID NO: 311: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: polymorphic fragment 99-365
(B) LOCATION: 1..48

(D) OTHER INFORMATION: variant version of SEQ ID305

(ix) FEATURE:

(A) NAME/KEY: polymorphic base

(B) LOCATION: 24

(D) OTHER INFORMATION: base t; c in SEQ ID305

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-365-mis1

(B) LOCATION: 5..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-365-mis2

(B) LOCATION: complement 25..48

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 311:



CCTACCAAGC AAGCAGCCCC AGCTTAGGGT CAGACAGGGT GAGCCTC 



(2) INFORMATION FOR SEQ ID NO: 312: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic fragment 99-2452 

(B) LOCATION: 1..47 

(D) OTHER INFORMATION: variant version of SEQ ID306 

(ix) FEATURE: 

(A) NAME/KEY: polymorphic base 

(B) LOCATION: 24 

(D) OTHER INFORMATION: base t; c in SEQ ID306

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-2452-mis1

(B) LOCATION: 5..23

(ix) FEATURE:

(A) NAME/KEY: Potential microsequencing oligo 99-2452-mis2
(B) LOCATION: complement 25..47

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 312:



tcggcgcgga catggaggac gtgtgcggcc gcctggtgca gtaccgc



(2) INFORMATION FOR SEQ ID NO: 313: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID301 and SEQ

ID307

(B) LOCATION: 1..20
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 313:



gctctcatat tcattgggtg 



(2) INFORMATION FOR SEQ ID NO: 314: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID302 and SEQ

ID308

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 314:



TCTCTCCCGT GTTAAATG 



(2) INFORMATION FOR SEQ ID NO: 315:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID303 and SEQ

ID309

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 315:



AATCTTCTTG CTCCTGTC 



(2) INFORMATION FOR SEQ ID NO: 316: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID304 and SEQ 

ID310 

(B) LOCATION: 1..18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 316: 



AGGTTAGGGG TGTATTTC 18



(2) INFORMATION FOR SEQ ID NO: 317:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE: 

(A) NAME/KEY: upstream amplification primer for SEQ ID305 and SEQ

ID311

(B) LOCATION: 1..18
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 317:



AGACTGTGAC CTTAGACC



(2) INFORMATION FOR SEQ ID NO: 318:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: upstream amplification primer for SEQ ID306 and SEQ

ID312

(B) LOCATION: 1..18

(D) OTHER INFORMATION: Extracted from sequence gb:M10065

(3791..3808)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 318:



GACGAGACCA TGAAGGAG 18



(2) INFORMATION FOR SEQ ID NO: 319: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID301 and

SEQ ID307

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 319:

TCGCTGCGGT TAGATGCTC 19



(2) INFORMATION FOR SEQ ID NO: 320:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID302 and

SEQ ID308

(B) LOCATION: 1..18



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 320: 



AGGGGTAACT CTTGATTG 



(2) INFORMATION FOR SEQ ID NO: 321: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID303 and

SEQ ID309

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 321:

ACCAAGGCAT AGCTTCTC 18

(2) INFORMATION FOR SEQ ID NO: 322:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID304 and

SEQ ID310

(B) LOCATION: 1..18
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 322:

ATACAGCCAG GGAGATAG 18

(2) INFORMATION FOR SEQ ID NO: 323:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID305 and 

SEQ ID311 

(B) LOCATION: 1..18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 323:
AATTGCTACC CCCAATTC



(2) INFORMATION FOR SEQ ID NO: 324:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: downstream amplification primer for SEQ ID306 and

SEQ ID312

(B) LOCATION: 1..18

(D) OTHER INFORMATION: Extracted from sequence gb:M10065
(complement ^i37 0. ,4 395)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 324:



TCGAACCAGC TCTTGAGG 



(2) INFORMATION FOR SEQ ID NO: 325:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-344.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 325:



TGCTGCCAAG GATCCATGTC AGC 23



(2) INFORMATION FOR SEQ ID NO: 326:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-366.mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 326:



CCTGGCTTCA GGGACAGCT 



(2) INFORMATION FOR SEQ ID NO: 327: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 
(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-359.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 327:



CTACAGAGTC ATCGCCTCCA TCC 



(2) INFORMATION FOR SEQ ID NO: 328: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-355.mis1

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 328:

GGAGTTTCGG GGAGTTTCGG GAG 23



(2) INFORMATION FOR SEQ ID NO: 329: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-365.mis1

(B) LOCATION: 1..19 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 329: 



CCAAGCAAGC AGCCCCAGC 



(2) INFORMATION FOR SEQ ID NO: 330: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 



(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: microsequencing oligo 99-2452.mis1

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 330:



CGCGGACATG GAGGACGTG 



(2) INFORMATION FOR SEQ ID NO: 331:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE
(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-344.mis2
(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 331:



CAGGGCTCAG AGAGGAGCA 



(2) INFORMATION FOR SEQ ID NO: 332:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 

(A) NAME/KEY: potential microsequencing oligo 99-366.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 332:



CACTAACTCA ACAAACATTT CCT 23



(2) INFORMATION FOR SEQ ID NO: 333:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs
(B) TYPE: NUCLEIC ACID

(C) STRANDEDNESS: SINGLE

(D) TOPOLOGY: LINEAR

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-359.mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 333:



TGCCAGGATT TGTTGAGAC 



(2) INFORMATION FOR SEQ ID NO: 334: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: microsequencing oligo 99-355.mis2

(B) LOCATION: 1..19

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 334: 



GGAGCTTCTT CCCAGGAAC 



(2) INFORMATION FOR SEQ ID NO: 335: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-365.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 335:



gaggctcacc ctgtctgacc cta



(2) INFORMATION FOR SEQ ID NO: 336:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: NUCLEIC ACID 

(C) STRANDEDNESS: SINGLE 

(D) TOPOLOGY: LINEAR 

(ii) MOLECULE TYPE: DNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 

(ix) FEATURE:

(A) NAME/KEY: potential microsequencing oligo 99-2452.mis2

(B) LOCATION: 1..23

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 336:



GCGGTACTGC ACCAGGCGGC CGC 23



